python
Tags: computers
Context sensitivity: https://groups.google.com/g/comp.theory/c/nJzOyRnAP3k/m/AInZxo48FNIJ?pli=1
Parsing PDFs in Python
- You can use: pypdf or pypdf4
- pdfminer
- also pypdfparser
- also textract -> https://textract.readthedocs.io/en/stable/python_package.html
- poppler -> pdf rendering engine?
- OCR -> tesseract with pytesseract, preprocess with opencv
- tika, with the Python tika interface
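As a rough illustration of the first option above, here is a minimal sketch of text extraction with pypdf. It assumes pypdf is installed (`pip install pypdf`); the helper names `join_page_texts` and `extract_pdf_text` are made up for this note and are not part of any library. Scanned PDFs with no text layer will come back empty and would need the OCR route instead.

```python
def join_page_texts(page_texts):
    """Join per-page text, skipping pages that yielded no text."""
    return "\n\n".join(t.strip() for t in page_texts if t and t.strip())


def extract_pdf_text(path):
    """Return the text of every page in the PDF at `path` (hypothetical helper)."""
    # Imported here so the rest of the module still loads without pypdf.
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(str(path))
    # extract_text() works on the PDF's text layer only; it does no OCR.
    return join_page_texts(page.extract_text() for page in reader.pages)
```

Usage would be something like `text = extract_pdf_text("report.pdf")`, where `report.pdf` is a placeholder path.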
Difflib
- https://docs.python.org/3/library/difflib.html
- generates diffs
- has several differ classes and functions for flexibly comparing sequences, including:
  - HTML output (HtmlDiff)
  - pure strings (get_close_matches)
  - number type diffs
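A short sketch of a few difflib entry points, using only the standard library. The word list and the file names `old.txt` / `new.txt` are placeholder data for illustration.

```python
import difflib

# Fuzzy string matching: best matches first, above a default similarity cutoff.
close = difflib.get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
print(close)  # ['apple', 'ape']

# Line-based diff in unified format (the format used by `diff -u` and git).
old = ["one\n", "two\n", "three\n"]
new = ["one\n", "2\n", "three\n"]
diff = list(difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt"))
print("".join(diff), end="")

# Similarity ratio between two arbitrary sequences, in the range [0, 1].
ratio = difflib.SequenceMatcher(None, "abcd", "bcde").ratio()
print(ratio)  # 0.75
```

`difflib.HtmlDiff().make_file(old, new)` would produce a side-by-side HTML page from the same two line lists.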
OS
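- CPU count:
  - https://docs.python.org/3/library/os.html#os.cpu_count returns the number of logical CPUs in the system (or None if undetermined), which can be more than the current process is allowed to use
  - https://docs.python.org/3/library/os.html#os.sched_getaffinity returns the set of CPUs the current process may run on, but is not available on all platforms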
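One way to combine the two calls is to prefer the affinity mask where the platform offers it and fall back to the system-wide count otherwise. This is a sketch; `usable_cpu_count` is a hypothetical helper name, not a standard-library function.

```python
import os


def usable_cpu_count():
    """Best-effort number of CPUs this process can actually use."""
    if hasattr(os, "sched_getaffinity"):
        # Only on some Unix platforms; pid 0 means the current process.
        return len(os.sched_getaffinity(0))
    # os.cpu_count() may return None if the count cannot be determined.
    return os.cpu_count() or 1


print(usable_cpu_count())
```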
Discussion Forum Changing from Mailing List
- https://lwn.net/Articles/901744/
- The community was scattered across PRs, issues, and mailing lists such as python-dev and python-committers
- Discussion is being unified on Discourse
Python Concurrency
- https://superfastpython.com/python-concurrency-choose-api/
- Coroutine (asyncio) vs thread (threading) vs process (multiprocessing)
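To make the three options concrete, here is a small sketch that runs toy tasks through each API. It uses the `concurrent.futures` executors as a convenience layer over threading and multiprocessing rather than those modules directly, and the task functions (`slow_io`, `slow_io_async`, `cpu_bound`) are invented for illustration. The usual rule of thumb, stated loosely: coroutines for many I/O-bound tasks when async libraries are available, threads for blocking I/O, processes for CPU-bound work that would otherwise be limited by the GIL.

```python
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def slow_io(n):
    """Stand-in for a blocking I/O call."""
    time.sleep(0.01)
    return n * 2


async def slow_io_async(n):
    """Stand-in for a non-blocking I/O call."""
    await asyncio.sleep(0.01)
    return n * 2


def cpu_bound(n):
    """Stand-in for CPU-heavy work."""
    return sum(i * i for i in range(n))


async def _gather(items):
    # Schedule all coroutines on one event loop and wait for all results.
    return await asyncio.gather(*(slow_io_async(n) for n in items))


def run_with_coroutines(items):
    return asyncio.run(_gather(items))


def run_with_threads(items):
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(slow_io, items))


def run_with_processes(items):
    # Worker functions must be importable (module level) so they can be pickled.
    with ProcessPoolExecutor(max_workers=2) as pool:
        return list(pool.map(cpu_bound, items))
```

When calling `run_with_processes` from a script, it is safest to do so under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module.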