rbudhu/python-docx-reader

python-docx-reader

A simple Microsoft Word .docx reader for Python.

Parses paragraphs, graphics, and inline equations (converted to TeX).

Requirements

  • Python 2.7
  • lxml>=3.4.1

Installation

python setup.py install

Usage

from docx.document import Document
doc = Document('path/to/your/docx/file')
# or doc = Document('path/to/your/docx/file', graphics=True, equations=True)

# Get a generator over all paragraphs
paragraphs = doc.paragraphs
# Iterate over paragraphs and print paragraph text, graphics, and equations
for paragraph in paragraphs:
    print(paragraph.text)
    print(paragraph.graphics)
    print(paragraph.equations)
# Get all of the text, graphics, and equations in the document
print(doc.text)
print(doc.graphics)
print(doc.equations)
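
For readers curious about what reading paragraphs from a .docx involves, here is a minimal standard-library sketch. It is not this package's implementation (which depends on lxml); it only relies on the fact that a .docx file is a zip archive whose main body is stored in word/document.xml. The names iter_paragraph_text and build_sample_docx are hypothetical helpers introduced for illustration, and graphics and equations are ignored entirely.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def iter_paragraph_text(docx_file):
    """Yield the plain text of each <w:p> paragraph in a .docx file.

    docx_file may be a filesystem path or a binary file-like object.
    Hypothetical helper: not part of python-docx-reader.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    for paragraph in root.iter("{%s}p" % W_NS):
        # Text lives in <w:t> elements nested inside the paragraph's runs.
        yield "".join(t.text or "" for t in paragraph.iter("{%s}t" % W_NS))


def build_sample_docx(paragraphs):
    """Build an in-memory archive containing only word/document.xml.

    Hypothetical demo helper; the text is inserted without XML escaping,
    so it is only suitable for plain sample strings.
    """
    body = "".join(
        "<w:p><w:r><w:t>%s</w:t></w:r></w:p>" % text for text in paragraphs
    )
    xml = '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>' % (W_NS, body)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buf.seek(0)
    return buf


# Demo on a tiny in-memory document; pass a real path to read an actual file.
sample = build_sample_docx(["Hello", "World"])
for text in iter_paragraph_text(sample):
    print(text)
```

This sketch treats every <w:p> element as one paragraph, so paragraphs nested inside other paragraphs (for example in text boxes) would be counted twice. That is a deliberate simplification for illustration.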
