198 questions
0
votes
1
answer
102
views
How do I parse this JSON data in C#, and would it be more beneficial to simply switch over to JavaScript?
I'm looking to parse this JSON and I've had nothing but problems. The link to the JSON is here. I'm trying to access the "href" field. While writing this up, I realized that the data ...
1
vote
0
answers
54
views
Scraping infinite scroll - GET URL not working
I am using import.io to scrape angel.co and as I usually do when there is an infinite scroll I'd open the devtools, look at the network and get the GET request with the right pagination.
Now when I ...
1
vote
0
answers
78
views
Import.io extract data from Post Request
I have a web site that returns parts of a web page as response data when a POST request is submitted. I have tried pasting the URL into the extractor, but it returned no data. Is there a way to extract the ...
1
vote
0
answers
179
views
Troubleshooting import.io
I am using import.io to scrape information from various websites for a research project. While it usually does a very good job, it occasionally outputs a blank page within the scraper interface and I ...
0
votes
1
answer
89
views
JSON Query for import.io
I'm using import.io and trying to figure out how to write code that uses multiple inputs to run a connector query. I've never used JSON before, but essentially what I'm trying to do is to expand ...
1
vote
1
answer
166
views
How to concat two values in one column using import.io
I am writing an extractor for newegg.com using import.io. I am facing one difficulty while grabbing price values from the listing page.
<div class="item-price-now">
<span>from</span>...
2
votes
1
answer
242
views
JSON Line issue when loading from import.io using Python
I'm having a hard time trying to load an API response from import.io into a file or a list.
The endpoint I'm using is https://data.import.io/extractor/{0}/json/latest?_apikey={1}
Previously all my ...
3
votes
1
answer
83
views
Url regex from data extracted from <script>
I have a problem with recognizing a string properly and excluding some trash from a string of URLs extracted from HTML.
Here is my string:
{"small":"[https://img.classistatic.com/crop/50x50/i.ebayimg....
0
votes
1
answer
392
views
Parse JSON in string format via PHP [closed]
I have a json file from import.io that returns null when decoded, but shows up as a string when encoded and is all there. How can I "loop" through a json string in PHP?
Json data is very lengthy so I ...
1
vote
1
answer
99
views
Does Import.io api support status of the extractor?
I've just created an extractor with import.io. This extractor uses chaining. Firstly I'm extracting some urls from one page and with these extracted urls, I'm extracting detail pages. When detail ...
-1
votes
1
answer
58
views
How would I go about parsing this? (Node.js)
{"extractorData":{"url":"http://mobcrush.com","resourceId":"VALUE","data":[{"group":[{"Userpart value":[{"text":"Galadon"}]},{"Userpart value":[{"text":"ShinKaigan"}]},{"Userpart value":[{"text":"...
0
votes
1
answer
507
views
import.io and portia regex url patterns
I am using data scrapers: Import.io & Portia.
They both allow you to define a regular expression for the crawler to abide by.
for example the url: https://weedmaps.com/dispensaries/pdi-medical
...
1
vote
1
answer
181
views
Replace double slash with single slash in import.io XPath selector
I am using import.io to scrape some pages. I came across a page that uses internal hrefs like this: http://domain.com//Event - notice the double slash after the domain name. From my research, this ...
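One way to deal with the double slash described in this question is to clean the URL after extraction instead of inside the selector. The sketch below is plain Python with a regular expression, not an import.io XPath selector, and `collapse_double_slashes` is a hypothetical helper name chosen for illustration.

```python
import re


def collapse_double_slashes(url: str) -> str:
    """Collapse runs of slashes in a URL, leaving the '://' after the scheme alone."""
    # (?<!:) rejects a run of slashes that directly follows the scheme's colon,
    # so "http://" is preserved while "com//Event" becomes "com/Event".
    return re.sub(r"(?<!:)/{2,}", "/", url)


if __name__ == "__main__":
    print(collapse_double_slashes("http://domain.com//Event"))
```

Note that this also collapses repeated slashes inside a query string, which may or may not be wanted for a given site.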
0
votes
1
answer
731
views
xpath to select full content of the class except one sub child class
<div class="content" style="opacity: 1;">
<div class="A" >
<ul class ="any" >
<div class ="demo" >
<div class="pipe" >
<ul class= "...
0
votes
1
answer
531
views
XPath - Extract specific file name from string
I'm trying to extract just the filename from a JavaScript link in import.io, e.g. googlebolver.htm from href="javascript:finpopup('googlebolver.htm',920,620,0)"
I've managed to get to the 'link' (...
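For the href shown in this question, the filename is the first quoted argument of `finpopup`. A minimal sketch of pulling it out with a regular expression in Python follows; this is a swapped-in technique, not an XPath expression, and `popup_filename` is a hypothetical helper name.

```python
import re
from typing import Optional


def popup_filename(href: str) -> Optional[str]:
    """Return the first single-quoted argument of finpopup(...), or None if absent."""
    # Capture everything between the first pair of single quotes after "finpopup(".
    match = re.search(r"finpopup\('([^']+)'", href)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(popup_filename("javascript:finpopup('googlebolver.htm',920,620,0)"))
```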
-2
votes
2
answers
85
views
Extracting information using XPaths
Good afternoon dear community,
I have finally compiled a list of working XPaths required to scrape all of the information from the URLs that I need.
I would like to ask for your suggestion, for a ...
0
votes
2
answers
146
views
Xpath class id + text
I am trying to scrape permissions tables in the following site: https://register.fca.org.uk/ShPo_FirmDetailsPage?id=001b000000MfaDiAAJ
I am trying to find out if XPath is capable of locating a ...
2
votes
2
answers
104
views
Should I use Xpath or regexp for this?
I'm no expert at languages, nor do I have any knowledge of them. I'm pulling data from a website that is half dynamic.
For example I need to have 2 columns for "Advising on a home purchase plan - Customer ...
3
votes
2
answers
2k
views
Xpath to select next parent of the current node
If a tr contains class="productnamecolor colors_productname", I want to select the next tr, which contains the price details. So I use:
.//a[@class="productnamecolor colors_productname"]/parent::node()/...
15
votes
2
answers
87k
views
Python requests call with URL using parameters
I am trying to make a call to the import.io API.
This call needs to have the following structure:
'https://extraction.import.io/query/extractor/{{crawler_id}}?_apikey=xxx&url=http://www.example....
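The difficulty with a call of this shape is usually that the `url` parameter is itself a URL and has to be percent-encoded. A minimal sketch using only the standard library is shown below; the crawler id, API key and target page are placeholders invented for illustration, and `build_query_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint shape taken from the question; {crawler_id} is filled in by the caller.
BASE = "https://extraction.import.io/query/extractor/{crawler_id}"


def build_query_url(crawler_id: str, api_key: str, target_url: str) -> str:
    """Build the request URL, percent-encoding the nested target URL."""
    # urlencode escapes ':', '/', '?' and '=' inside target_url, so the
    # nested URL cannot be confused with the outer query string.
    query = urlencode({"_apikey": api_key, "url": target_url})
    return BASE.format(crawler_id=crawler_id) + "?" + query


if __name__ == "__main__":
    # All three arguments are made-up placeholder values.
    print(build_query_url("abc123", "xxx", "http://www.example.org/page?x=1"))
```

With the `requests` library, passing the same dictionary as `params=` to `requests.get` should produce an equivalent encoding.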
1
vote
1
answer
237
views
Listing extractors from import.io
I would like to know how to get the crawling data (list of URLs manually input through the GUI) from my import.io extractors.
The API documentation is very scarce and it does not specify if the GET ...
1
vote
1
answer
2k
views
Xpath to extract from following siblings until the next specified node
I'm trying to extract everything from the p nodes that follow the h2 containing "Summary" until I get to the next h2.
This is what I have so far:
.//h2[contains(text(),'Summary')]/following-...
1
vote
3
answers
180
views
What XPath do I need to extract specific data from the Edmunds website?
I am using import.io software to extract data from Edmunds... example page http://www.edmunds.com/bugatti/veyron-164/2009/st-101194582/features-specs/
I emailed [email protected] a few times but ...
1
vote
1
answer
1k
views
XPath to ignore text within an element?
On this website: http://www.yankeecandle.com/browse/candles/jar-candles/_/N-9yf
Using import.io to get data from the page.
I'm looking for an XPath that gets me only the lowest price, so the 10.99 ...
-1
votes
1
answer
154
views
Can't access import.io's new web extractor/dashboard
I'm reading through these articles:
https://www.import.io/post/our-biggest-update-ever/
http://importio.desk.com/customer/en/portal/articles/2402346-import-io-product-list
First one says:
If you ...
1
vote
1
answer
82
views
Same URL scrapes different data when run from settings
I'm trying to scrape some data, and when I run New Extractor for the first time with this URL it returns a list of cars, exactly what I see. However, when I save it and run it from the settings, exactly same ...
1
vote
1
answer
72
views
Change column to HTML type with import.io web extractor
I see several tutorials showing how to change the column type when creating a new extractor with import.io, however none of them seem to match the newest version. I was able to change the column type ...
1
vote
1
answer
123
views
GET API code request failure
I just started learning how to use APIs and I found some really useful websites and apps like Postman and import.io, yet I'm having problems finishing it without help.
I started my little project by ...
2
votes
1
answer
151
views
How to download all extractors along with the endpoints for RESTful request?
I've been using import.io to extract lots of data from hundreds of web pages. I've already created extractors for those URLs and am still adding more.
I've designed an automated process that sends an ...
0
votes
2
answers
917
views
Insert data from API into DOM
I retrieve data from an API and I want to get it in my bootstrap layout.
$.ajax({
url: 'https://api.import.io/store/connector/b5caf0ef-1e6b-4fba-9fa4-21e475196673/_query?input=webpage/url:http%...
0
votes
1
answer
108
views
Getting Bulk Extraction in the import.io API via HTTP in Java
Thing is, I'm using the information on this link combined with some tips about http posting in order to get some data of an Extractor in import.io. It would be MUCH better if I were able to get the ...
-2
votes
1
answer
135
views
Screen Scraping password protected sites
I just have a very general question, but of vital importance to me. May I know if import.io supports screen scraping of password protected sites? If not, could someone suggest some tools that do? ...
0
votes
0
answers
31
views
Formatting a lot of JSON mashed into a legit format [duplicate]
So I'm having this problem about the output of an import.io command. My results' format is the following:
{a line which is a valid JSON by itself}
{a line which is a valid JSON by itself}
{a line ...
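Output in which every line is a valid JSON document by itself is commonly called JSON Lines, and it can be read one line at a time. A minimal sketch in Python follows; `parse_json_lines` is a hypothetical helper name and the sample data is made up for illustration.

```python
import json
from typing import Any, List


def parse_json_lines(text: str) -> List[Any]:
    """Parse text where each non-empty line is an independent JSON document."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines between records
            records.append(json.loads(line))
    return records


if __name__ == "__main__":
    sample = '{"a": 1}\n{"a": 2}\n'  # made-up sample, not real import.io output
    # Dumping the list yields one JSON array that standard parsers accept.
    print(json.dumps(parse_json_lines(sample)))
```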
1
vote
1
answer
107
views
How to send crawler data to PHP via command line?
Can I send the results to PHP rather than storing them in the JSON file?
I have these two files
settings.json
{
"outputFile" : "C:\\wamp\\www\\drestip\\admin\\crawls\\mimshoes.json",
"logFile" : "...
2
votes
1
answer
98
views
URL Redirection in import.io
Hi I am working on URL http://www.goodtoknow.co.uk/recipes/healthy?page=1&cost_range=any&total_time=any&skill_level=any&tags%5B0%5D=Healthy&tags%5B1%5D=Healthy and creating ...
1
vote
1
answer
138
views
import.io display data on wordpress page
I have built an api using import.io to get some data about a product from a url.
I'm new to this and need a bit of help: how do I display the information gathered from a specific URL on my WordPress ...
3
votes
1
answer
141
views
Using import.io with mouseover text
Long time viewer, first time poster!
I am having some trouble... I notice that apparently scraping mouseover text is an option when crawling webpages now (http://support.import.io/forums/199278-ideas-...
2
votes
1
answer
91
views
Regex Lookahead in Import.io (IF-Else-Then)
I'm looking for a regex in import.io crawler script.
The text can either contain:
xxx – yyy – zzz
rrr – sss
Or
xxx
yyy
In either case I need the yyy part. So I created the following lookahead ...
3
votes
1
answer
80
views
Import.io returns empty columns for javascript enabled api
I have searched here and couldn't find any answers. Some columns of an import.io API are not returning any data. The data is behind JavaScript; during training it returns data, but during bulk ...
0
votes
1
answer
186
views
How can I get data from my import.io API?
I'm trying to get data from my import.io API. I want to display this data on my site in an unordered list.
Can you tell how to do this?
This is what I have so far:
document.addEventListener('...
2
votes
1
answer
137
views
import.io Connector - with Javascript
I'm trying to extract the email from this [page] (http://royalenfield.com/locateus/genuine-parts-distributors/) by choosing country, state, and city. I can't get Connector to cooperate. I enabled ...
4
votes
1
answer
408
views
Import.io > Extractor: page never loads, so cannot extract data
Import.io is working pretty well, but there is one website I would like to extract data from; when I start the extractor and enter the URL http://restaurant.michelin.fr/restaurants/france/75000-...
5
votes
1
answer
138
views
Can I use extractor for local html files
I'm using this article here to load a local html file. I can use extractor to get the data, but I can't publish the API. I'd like to run the API extractor on multiple pages. Is this possible?
4
votes
1
answer
93
views
How to make import.io retrieve stored data
Every time I run the API in the Android app, it runs the query itself and retrieves data from the website instead of the stored data. How do I make it retrieve the stored data to save running time?
2
votes
1
answer
63
views
Need to keep <br> in text block tags while using import.io
Looking to do something relatively straightforward, I'm scraping text which so far I have had no problem grabbing, but I need to keep the <br> tags because white space analysis is an important ...
0
votes
0
answers
723
views
Troubleshoot "Failed to Create the Java Virtual Machine" when launching Import.io
For several months I've been getting this error when trying to launch the Import.io app on Windows 10:
"Failed to create the Java Virtual Machine"
The company assures me that they don't have this ...
1
vote
1
answer
199
views
How to create a web app with data from import.io
I am fairly new to web dev and I know the basics of html, CSS, JavaScript and jQuery, so just front end.
As a challenge to learn more I want to create a web app that does the following:
Get the ...
1
vote
2
answers
315
views
Filemaker Pro and Import.IO. How to import into FMP using Import.IO API?
I have a nice web scraper in Import.IO and I want to set up automatic uploads from Import.IO into Filemaker Pro. I've spent months on this and I have no idea why it does not work.
Here is what I've ...
1
vote
3
answers
587
views
What XPath do I need to extract the text inside a SPAN that is preceded by a specific label inside a STRONG, both inside a P?
What XPath do I need to extract the text inside a SPAN that is preceded by a specific label inside a STRONG, both inside a P?
For example to extract website and email addresses from a page that looks like ...
2
votes
0
answers
249
views
Import.io Api won't work with android
I have created an Import.io API to replace the kimonoLabs API I had in my application. However, when I try to connect, it never manages to; it keeps giving
com.android.volley.RedirectError
any ideas??
...