Extract text from TeX, remove LaTeX tags
06-07-2019
Question
I have some .tex files from which I want to extract the plain text, without any LaTeX commands such as \section{...} or \newpage.
Does anybody have any idea on how to achieve this?
I also have the .pdf file, but when I copy the text from it, some words get concatenated, which is a real problem.
Do you know of any tool for this?
Solution
Please see the OpenDetex GitHub page for the latest version of OpenDetex. It is a more modern, derivative version of my original DeTeX.
My legacy DeTeX home page is available here.
If you just want the legacy detex-2.8.tar source, you can get it here.
OTHER TIPS
OpenDetex is available for both Windows and Linux.
Download OpenDetex from one of these links:
http://opendetex.googlecode.com/files/opendetex-2.8.1.tar.bz2
http://code.google.com/p/opendetex/downloads/list
Usage: http://code.google.com/p/opendetex/wiki/Usage
Extract it to any directory of your choice, for example your Downloads directory.
Optionally, create another directory inside the extracted one (any name will do, but it helps keep things tidy), say “my_paper”. Put your paper in the “my_paper” directory; say the paper is named project.tex.
Change into the opendetex directory:
cd ~/Downloads/opendetex
Run the command (use ./detex instead if the detex binary is in the current directory and not on your PATH):
detex -n my_paper/project.tex > out.txt
The general form is:
detex -n full_path_to_tex_file.tex > output_text_file.txt
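Since the question mentions several .tex files, the command above can be wrapped in a loop. This is only a minimal sketch: the function name detex_all and the DETEX override variable are made up for illustration and are not part of OpenDetex. It assumes detex is on your PATH (or that DETEX points at the binary) and writes each result next to its source file.

```shell
# Convert every .tex file in a directory to plain text with detex.
# detex_all and DETEX are hypothetical names used for this sketch only.
detex_all() {
    src_dir="${1:-.}"
    for tex in "$src_dir"/*.tex; do
        # Skip the unexpanded pattern when the directory has no .tex files.
        [ -e "$tex" ] || continue
        # Same invocation as above; output goes to NAME.txt beside NAME.tex.
        "${DETEX:-detex}" -n "$tex" > "${tex%.tex}.txt"
    done
}
```

For example, detex_all my_paper would write my_paper/project.txt from my_paper/project.tex. Existing .txt files with the same base name are overwritten.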