html2text does not work with uppercase html tags #23

Open
tobiase opened this issue Jul 6, 2021 · 0 comments
tobiase commented Jul 6, 2021

print (new Html2Text('<P>Test string</P>'))->getText();

prints nothing, while

print (new Html2Text('<p>Test string</p>'))->getText();

prints Test string as expected.

The cause is in \voku\Html2Text\Html2Text::pregCallback.
$matches['element'] is converted to lowercase at the start of the method with

$element = \strtolower($matches['element']);

but the lowercased $element is never used. Both the switch statement and the heading match still read $matches['element']:

    protected function pregCallback(array $matches): string
    {
        // init
        $element = \strtolower($matches['element']);

        switch ($matches['element']) { // Case sensitive
            case 'p':
                // Replace newlines with spaces.
                $para = \str_replace("\n", ' ', $matches['value']);

                // Add trailing newlines for this paragraph.
                return "\n\n" . $para . "\n\n";
                ...
                ...
        }

        // h1 - h6
        if (\preg_match('/h[123456]/', $matches['element'])) {  // Case sensitive
            return $this->convertElement($matches['value'], $matches['element']);
        }
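
A minimal sketch of how the lowercased name could be used in both places. This is not the library's code: demoPregCallback is a hypothetical standalone function, the regex that drives it is a simplified assumption, and \strtoupper only stands in for whatever convertElement does with headings. The final return '' mirrors the behaviour described in this issue.

```php
<?php

/**
 * Hypothetical stand-in for Html2Text::pregCallback, for illustration only.
 */
function demoPregCallback(array $matches): string
{
    // Normalize once, then use the normalized name everywhere below.
    $element = \strtolower($matches['element']);

    switch ($element) {
        case 'p':
            // Replace newlines with spaces.
            $para = \str_replace("\n", ' ', $matches['value']);

            // Add surrounding newlines for this paragraph.
            return "\n\n" . $para . "\n\n";
    }

    // h1 - h6, matched against the lowercased name.
    if (\preg_match('/^h[1-6]$/', $element)) {
        // Assumption: upper-casing stands in for the real heading conversion.
        return \strtoupper($matches['value']);
    }

    // Unhandled element: empty string, as described in the issue.
    return '';
}

// Simplified pattern (an assumption) exposing 'element' and 'value' groups.
$pattern = '/<(?<element>[a-z0-9]+)>(?<value>.*?)<\/\1>/is';

echo \preg_replace_callback($pattern, 'demoPregCallback', '<P>Test string</P>');
```

With this change, '<P>Test string</P>' and '<p>Test string</p>' take the same branch, so the uppercase input is no longer dropped.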

I also don't understand why pregCallback returns an empty string on its last line.
Shouldn't the code always know how to handle any $matches['element'] that reaches pregCallback?
Wouldn't it be better to throw an exception when no handling exists for $matches['element'],
or to return $matches['value'] ?? ''?
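
The two alternatives could look roughly like this. handleUnknownElement and its $strict flag are hypothetical names made up for this sketch and do not exist in the library. The sketch only shows the trade-off between failing loudly and keeping the inner text.

```php
<?php

/**
 * Hypothetical helper, not part of the library: decides what to do with
 * an element that no earlier branch handled.
 */
function handleUnknownElement(array $matches, bool $strict = false): string
{
    if ($strict) {
        // Option 1: fail loudly so that a missing handler is noticed.
        throw new \UnexpectedValueException(
            'No handler for element: ' . ($matches['element'] ?? '(none)')
        );
    }

    // Option 2: keep the inner text instead of silently dropping it.
    return $matches['value'] ?? '';
}

echo handleUnknownElement(['element' => 'blink', 'value' => 'Test string']), "\n";
```

Throwing makes bugs like the uppercase case visible immediately, while returning the value keeps the conversion going on unexpected input. Which one fits better depends on how the callback's regex is built.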
