Releases: elixir-crawly/crawly
0.15.0 Release
What's Changed
- Created a simple management Web UI. Try it at localhost:4001
- Added the ability to define spiders in YML. Read more here: https://github.com/elixir-crawly/crawly/blob/master/documentation/spiders_in_yml.md
- Added the ability to run Crawly (and your scraping projects) without Elixir. Read more here: https://github.com/elixir-crawly/crawly/blob/master/documentation/standalone_crawly.md
- Added generators for Crawly spiders and configuration files to reduce boilerplate
- Improved UniqueRequest middleware so that it can store hashes instead of complete URLs (special thanks to @serpent213)
- Added SameDomainFilter middleware, my favorite, which will probably remove the need to rely on base_url in the future. Thanks again to @serpent213!
Release 0.13.0
- Bugfix for start_urls size (it is now possible to have a very large list of start URLs)
- Split business logs from other logs; logging is now per spider
- Send logs to CrawlyUI (optional)
- Allow overriding more spider options:
  - closespider_itemcount
  - closespider_timeout
  - concurrent_requests_per_domain (number of started workers)
- Change on_spider_log_callback (now it also gets the crawl_id)
- Parse pipelines
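
To illustrate the option names listed above, here is a minimal sketch that sets the same three keys at the application level. It assumes a typical Mix project with a `config/config.exs` file. The values are arbitrary placeholders, and these notes do not show how the per-spider override itself is passed, so treat this only as an example of the key names.

```elixir
# config/config.exs (assumed location in a standard Mix project)
import Config

config :crawly,
  # Item-count threshold for closing a spider (placeholder value)
  closespider_itemcount: 1000,
  # See Crawly's configuration docs for the exact semantics (placeholder value)
  closespider_timeout: 10,
  # Number of started workers, per the note above (placeholder value)
  concurrent_requests_per_domain: 8
```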
0.12.0
Update gollum so we can use new HTTPoison (#139)
Release 0.11.0
Update version to 0.11.0
Release 0.10.0
The release includes the following improvements:
- WriteToFile pipeline now adds timestamps to filenames
- WriteToFile pipeline will now create a folder if missing
- SendToUI item pipeline sends data to the experimental CrawlyUI management dashboard
- Other smaller features
Release 0.9.0
This release contains the following features:
- Automatic cookie management (allows scraping websites behind a login form or a ZIP code form)
- Spider custom settings (allows overriding settings like concurrency on the spider level)
- Injected on_spider_closed_callback (allows notifying other parts of the system when a crawl ends)
- Documentation fixes and improvements
Release 0.8.0
This release contains the following features:
- Retries support
- Pluggable user agents
- Browser rendering support
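
As a rough sketch of how retries and pluggable user agents might be configured, the fragment below follows the shapes described in Crawly's configuration documentation as best understood. The exact keys may differ between versions, and the status code, retry count, and user agent string are placeholder assumptions.

```elixir
# config/config.exs (assumed location in a standard Mix project)
import Config

config :crawly,
  # Retry settings: which HTTP status codes trigger a retry, and how many times
  retry: [
    retry_codes: [400],
    max_retries: 3,
    # Middlewares skipped for retried requests, so a retry is not
    # dropped as a duplicate of the original request
    ignored_middlewares: [Crawly.Middlewares.UniqueRequest]
  ],
  middlewares: [
    Crawly.Middlewares.UniqueRequest,
    # User agents are supplied as a list; "Example Bot" is a placeholder
    {Crawly.Middlewares.UserAgent, user_agents: ["Example Bot"]}
  ]
```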