
I want to extract some data from the web, and I am using the Web Scraper developer tool in Chrome. My web pages contain sections in which the details of each product (graphics card, processor, display, etc.) are listed. Each section contains many rows, and the positions of these rows are not fixed. If the rows were marked up with tr and td tags, I could apply a condition like tr:contains('Prozessortyp') td.value, which takes the td value only if the row contains "Prozessortyp".

But the website I am scraping marks up these rows with dt and dd tags instead. Screenshots of one such section are attached.

When I select the first row in this section, the selector Web Scraper gives me is section:nth-of-type(2) dd:nth-of-type(1). How can I add a condition so that the value is taken only from the row whose key is "Arbeitsspeicher-Typ"?

Thank you :)

1 Answer


CSS selectors can select siblings of an element. For this use case, you'll want the adjacent sibling combinator (+), which matches the dd that immediately follows a given dt:

dt:contains("Arbeitsspeicher-Typ") + dd
dt:contains("Speichergeschwindigkeit") + dd
...

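This should do the trick, provided each selector matches only one element under its parent in the selector graph. To keep it unambiguous, I'd recommend using dl.specification as the parent selector. Note that :contains is a jQuery-style extension rather than standard CSS, so these selectors work in Web Scraper but not in a plain stylesheet.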

If one of the dt elements represents a boolean property that is not easily captured as text, for instance when the dd contains only an SVG checkmark, select the dt itself:

dt:contains("Validated")

When the row is only present if the property holds, simply checking for the existence of the dt (omitting the + dd part) yields the desired information.
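If you ever need the same lookup outside Web Scraper, the dt/dd pairing can be sketched with Python's standard html.parser. This is only an illustration under assumptions: the markup below is made up to resemble the question, and DefinitionListParser is a hypothetical helper name. Unlike :contains, which matches substrings, this sketch compares the full label text.

```python
from html.parser import HTMLParser


class DefinitionListParser(HTMLParser):
    """Hypothetical helper: store each dd under the dt that precedes it."""

    def __init__(self):
        super().__init__()
        self.pairs = {}    # dt text -> dd text
        self._tag = None   # "dt" or "dd" while inside one, else None
        self._key = None   # text of the most recent dt still awaiting a dd
        self._buf = []     # text fragments of the current dt/dd

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of a nested element such as <svg>
        text = "".join(self._buf).strip()
        if tag == "dt":
            self._key = text
        elif self._key is not None:
            # Only the first dd after a dt is kept, mirroring `dt + dd`.
            self.pairs[self._key] = text
            self._key = None
        self._tag = None


# Assumed sample markup; the values are invented for illustration.
SAMPLE = """
<dl class="specification">
  <dt>Speichergeschwindigkeit</dt><dd>2400 MHz</dd>
  <dt>Arbeitsspeicher-Typ</dt><dd>DDR4</dd>
  <dt>Validated</dt><dd><svg></svg></dd>
</dl>
"""

parser = DefinitionListParser()
parser.feed(SAMPLE)

# Value lookup by key, independent of the row position.
print(parser.pairs.get("Arbeitsspeicher-Typ"))
# Boolean property: the dd has no text, so test for the key's presence.
print("Validated" in parser.pairs)
```

Because the lookup is by label rather than by position, reordering the rows in the dl does not change the result.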
