All Questions
8 questions
1 vote, 3 answers, 3k views
Extract entire textual data from Edgar 10-K using python
I am trying to extract the entire text from the URL given below as an example. I have many URLs, so I want to automate this. I tried every code snippet posted here, but they all raise errors, e.g. AttributeError: 'NoneType' ...
0 votes, 2 answers, 1k views
Extract business description (Item 1) of multiple firms from their 10-K reports
I am trying to extract the business descriptions of multiple firms from their 10-K reports using the R package edgar. I am using the getBusinDescr function to do so. However, I am only able to extract Item 1 ...
3 votes, 1 answer, 2k views
Parse XML with Python lxml
I am trying to parse an XML file using the Python library lxml, and would like the resulting output to be in a dataframe. I am relatively new to Python and parsing, so please bear with me as I outline the ...
2 votes, 1 answer, 2k views
Extracting table of holdings from (Edgar 13-F filings) TXT (pre-2013) with python
I am working on extracting a table of holdings from the 13-F form on EDGAR. Before 2013, holdings were given in a TXT file (see example).
The output I am aiming for is a pd.DataFrame with the same shape as the ...
0 votes, 1 answer, 720 views
Saving SEC 10-K annual report text to files (trouble with decoding)
I am trying to bulk-download the text visible to the "end-user" from 10-K SEC Edgar reports (I don't care about tables) and save it in a text file. I found the code below on YouTube; however, I am ...
1 vote, 0 answers, 679 views
Count keywords in SEC Edgar 10-K filings text-body with Python
I am trying to parse the text section of the SEC Edgar texts in Python 3, e.g.: https://www.sec.gov/Archives/edgar/data/796343/0000796343-14-000004.txt
My goal is to collect the number of occurrences ...
5 votes, 1 answer, 441 views
SEC company filings: Is the <SEC-HEADER> tag valid SGML? If so, how to parse it?
I tried to parse SEC company filings from sec.gov. Starting from the FB 10-Q index.htm, let's look at a complete submission text filing. It has a structure like:
<...
3 votes, 0 answers, 734 views
How would I approach a lot of structured-but-inconsistent data? [closed]
I'm attempting to parse EDGAR documents - they're SEC filings. Specifically, I'm attempting to parse both SEC Schedule 13D and Schedule 13G filings.
There appear to be lots of failed attempts at ...