I'm trying out Tabula in Python on a PDF, using Visual Studio Code on macOS.
import pandas as pd
import tabula
# Read the tables on page 1; read_pdf returns a list of DataFrames
dfs = tabula.read_pdf("/Users/TEST.pdf", pages=1)
len(dfs)
When I run the code, however, I get the following error:
FileNotFoundError: [Errno 2] JVM DLL not found: /Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home/lib/jli/libjli.dylib
I have installed Java both via Homebrew and via a .pkg installer. Both installs apparently succeeded, and I can run a simple Java program in Visual Studio Code just fine. So Java is installed, but despite a few attempts I don't know how to solve the error above.
I am really new to Python and to installing packages, so if you think you can answer, please walk me through it like I'm 5 years old.
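In case it helps with diagnosing this, here is a minimal sketch of a check that reports what JAVA_HOME is set to and whether the library named in the error actually exists on disk. The helper name `describe_java_setup` is hypothetical (it is not part of tabula), and the path is copied from the error message above.

```python
import os
from pathlib import Path


def describe_java_setup(jvm_lib_path, environ=None):
    """Report JAVA_HOME and whether the JVM library from the error exists.

    Hypothetical helper for illustration; not part of tabula.
    """
    env = os.environ if environ is None else environ
    java_home = env.get("JAVA_HOME")
    return {
        "JAVA_HOME": java_home,
        "JAVA_HOME exists": bool(java_home) and Path(java_home).is_dir(),
        "JVM library exists": Path(jvm_lib_path).is_file(),
    }


if __name__ == "__main__":
    # Path copied from the FileNotFoundError message.
    missing = (
        "/Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk"
        "/Contents/Home/lib/jli/libjli.dylib"
    )
    for key, value in describe_java_setup(missing).items():
        print(f"{key}: {value}")
```

If "JVM library exists" comes back False, the path the JVM loader is using points at a JDK that is missing or incomplete, which would be consistent with the error.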
UPDATE:
import os
# Set the JAVA_HOME environment variable to the Java installation directory
os.environ["JAVA_HOME"] = "/opt/homebrew/opt/openjdk/libexec/openjdk.jdk"
import pandas as pd
import tabula
dfs = tabula.read_pdf("/Users/NickCoding/Desktop/TEST.pdf", pages=1)
len(dfs)
This allows the code to work; however, I feel that it is a botched solution.
How do I get it to work in the virtual environment?
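One less hard-coded variant I have considered is sketched below. It assumes the macOS tool /usr/libexec/java_home is available and only sets JAVA_HOME when it is not already set. The function names `find_java_home` and `ensure_java_home` are hypothetical. This is a sketch, not a confirmed fix: /usr/libexec/java_home may well report the same JDK that the error complains about, and as far as I understand it does not see a Homebrew JDK unless that JDK has been linked into /Library/Java/JavaVirtualMachines.

```python
import os
import subprocess


def find_java_home():
    """Ask macOS's /usr/libexec/java_home for the default JDK.

    Returns None if the tool is missing or reports no JDK.
    """
    try:
        result = subprocess.run(
            ["/usr/libexec/java_home"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def ensure_java_home(environ=None, finder=find_java_home):
    """Set JAVA_HOME only if it is unset; return the resulting value (or None)."""
    env = os.environ if environ is None else environ
    if not env.get("JAVA_HOME"):
        found = finder()
        if found:
            env["JAVA_HOME"] = found
    return env.get("JAVA_HOME")


if __name__ == "__main__":
    # Call this before importing tabula so the setting is in place when Java is located.
    print("JAVA_HOME:", ensure_java_home())
```

Because it leaves an existing JAVA_HOME alone, a value set in the shell profile or in the editor's environment settings would still take precedence over the lookup.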
Set JAVA_HOME for your system.