extract text from tex, remove latex tags

https://stackoverflow.com/questions/829408

latex

06-07-2019

Question

I have some .tex files from which I want to extract the plain text, without any LaTeX commands such as \section{...} or \newpage.
Does anybody know how to achieve this? I also have the .pdf file, but when I copy the text from it, some words get concatenated, which makes the result unusable.
Is there a tool for this?

Solution

detex(1):

Please see the OpenDetex GitHub page for the latest version of OpenDetex. It is a more modern, derivative version of my original DeTeX.

My legacy DeTeX home page is available here.

If you just want the legacy detex-2.8.tar source, you can get it here.

OTHER TIPS

OpenDetex is available for both Windows and Linux.

Download OpenDetex from one of these links:
http://opendetex.googlecode.com/files/opendetex-2.8.1.tar.bz2
http://code.google.com/p/opendetex/downloads/list

Usage: http://code.google.com/p/opendetex/wiki/Usage

Extract it to any directory of your choice, for example your Downloads directory.

Optionally, create another directory inside the extracted one; this keeps your files separate from the program's. Say the directory is named "my_paper". Put your paper in the "my_paper" directory. Say your paper is named project.tex.

Change into the opendetex directory:

cd ~/Downloads/opendetex

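Run the command below. The -n option tells detex not to follow \input and \include commands. If the detex binary is in the current directory rather than on your PATH, invoke it as ./detex: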

detex -n my_paper/project.tex  > out.txt

The general form is:

detex -n full_path_to_tex_file.tex > output_text_file.txt
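
If you have several .tex files, the general form above can be wrapped in a loop. The following is only a minimal sketch, assuming detex is on your PATH; the function name detex_dir and the DETEX variable are made up for this example and are not part of OpenDetex.

```shell
#!/bin/sh
# Hypothetical helper: run detex on every .tex file in a directory and
# write the plain text next to it as <name>.txt.
# DETEX may point to the binary (e.g. ./detex); it defaults to "detex".

detex_dir() {
    dir="${1:-.}"                      # directory to process, default: current
    for f in "$dir"/*.tex; do
        [ -e "$f" ] || continue        # skip the literal pattern if nothing matches
        # -n: do not follow \input and \include
        "${DETEX:-detex}" -n "$f" > "${f%.tex}.txt"
    done
}
```

For example, detex_dir my_paper would create my_paper/project.txt from my_paper/project.tex. Note that an existing .txt file with the same name is overwritten.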

Licensed under: CC-BY-SA with attribution