Consider the following article:
https://arxiv.org/pdf/2101.05907.pdf
It's a typically formatted academic paper with only two figures in the PDF file.
The following code was used to extract the text and equations from the paper:
# Related code explanation: https://stackoverflow.com/questions/45470964/python-extracting-text-from-webpage-pdf
import io
import requests
# URL of the paper linked above
url = "https://arxiv.org/pdf/2101.05907.pdf"
r = requests.get(url)
f = io.BytesIO(r.content)
# Related code explanation: https://stackoverflow.com/questions/45795089/how-can-i-read-pdf-in-python
import PyPDF2
fileReader = PyPDF2.PdfFileReader(f)
# Related code explanation: https://automatetheboringstuff.com/chapter13/
# Print the text extracted from the first page
print(fileReader.getPage(0).extractText())
However, the result was not quite correct:
Bohmpotentialforthetimedependentharmonicoscillator
FranciscoSoto-Eguibar
1
,FelipeA.Asenjo
2
,SergioA.Hojman
3
andH
´
ectorM.
Moya-Cessa
1
1
InstitutoNacionaldeAstrof´
´
OpticayElectr´onica,CalleLuisEnriqueErroNo.1,SantaMar´Tonanzintla,
Puebla,72840,Mexico.
2
FacultaddeIngenier´yCiencias,UniversidadAdolfoIb´aŸnez,Santiago7491169,Chile.
3
DepartamentodeCiencias,FacultaddeArtesLiberales,UniversidadAdolfoIb´aŸnez,Santiago7491169,Chile.
DepartamentodeF´FacultaddeCiencias,UniversidaddeChile,Santiago7800003,Chile.
CentrodeRecursosEducativosAvanzados,CREA,Santiago7500018,Chile.
Abstract.
IntheMadelung-Bohmapproachtoquantummechanics,weconsidera(timedependent)phasethatdependsquadrati-
callyonpositionandshowthatitleadstoaBohmpotentialthatcorrespondstoatimedependentharmonicoscillator,providedthe
timedependentterminthephaseobeysanErmakovequation.
Introduction
Harmonicoscillatorsarethebuildingblocksinseveralbranchesofphysics,fromclassicalmechanicstoquantum
mechanicalsystems.Inparticular,forquantummechanicalsystems,wavefunctionshavebeenreconstructedasisthe
caseforquantizedincavities[1]andforion-laserinteractions[2].Extensionsfromsingleharmonicoscillators
totimedependentharmonicoscillatorsmaybefoundinshortcutstoadiabaticity[3],quantizedpropagatingin
dielectricmedia[4],Casimire
ect[5]andion-laserinteractions[6],wherethetimedependenceisnecessaryinorder
totraptheion.
Timedependentharmonicoscillatorshavebeenextensivelystudiedandseveralinvariantshavebeenobtained[7,8,9,
10,11].Alsoalgebraicmethodstoobtaintheevolutionoperatorhavebeenshown[12].Theyhavebeensolvedunder
variousscenariossuchastimedependentmass[12,13,14],timedependentfrequency[15,11]andapplicationsof
invariantmethodshavebeenstudiedindi
erentregimes[16].Suchinvariantsmaybeusedtocontrolquantumnoise
[17]andtostudythepropagationoflightinwaveguidearrays[18,19].Harmonicoscillatorsmaybeusedinmore
generalsystemssuchaswaveguidearrays[20,21,22].
Inthiscontribution,weuseanoperatorapproachtosolvetheone-dimensionalSchr
¨
odingerequationintheBohm-
Madelungformalismofquantummechanics.ThisformalismhasbeenusedtosolvetheSchr
¨
odingerequationfor
di
erentsystemsbytakingtheadvantageoftheirnon-vanishingBohmpotentials[23,24,25,26].Alongthiswork,
weshowthatatimedependentharmonicoscillatormaybeobtainedbychoosingapositiondependentquadratictime
dependentphaseandaGaussianamplitudeforthewavefunction.Wesolvetheprobabilityequationbyusingoperator
techniques.Asanexamplewegivearationalfunctionoftimeforthetimedependentfrequencyandshowthatthe
Bohmpotentialhasdi
erentbehaviorforthatfunctionalitybecauseanauxiliaryfunctionneededinthescheme,
namelythefunctionsthatsolvestheErmakovequation,presentstwodi
erentsolutions.
One-dimensionalMadelung-Bohmapproach
ThemainequationinquantummechanicsistheSchrodingerequation,thatinonedimensionandforapotential
V
(
x
;
t
)
iswrittenas(forsimplicity,weset
}
=
1)
i
@ 
(
x
;
t
)
@
t
=
1
2
m
@
2
 
(
x
;
t
)
@
x
2
+
V
(
x
;
t
)
 
(
x
;
t
)
(1)
arXiv:2101.05907v1  [quant-ph]  14 Jan 2021
How can I fix this and correctly extract the text and equations from a PDF file that was generated from LaTeX?
In the meantime, PyPDF2 has been deprecated. Use pypdf instead (I'm the maintainer of both; see the migration guide).
pypdf has nothing specific for equations, but general text extraction works like this:
import io
import requests
from pypdf import PdfReader
# Download content
url = "https://arxiv.org/pdf/2101.05907.pdf"
r = requests.get(url)
f = io.BytesIO(r.content)
# Extract text
reader = PdfReader(f)
print(reader.pages[0].extract_text())
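The snippet above prints only the first page. A minimal sketch for collecting the text of every page could look like the following; the helper names `extract_all_text` and `read_pdf_from_url`, the form-feed separator and the 30-second timeout are illustrative choices, not part of pypdf. The third-party imports are deferred into the download helper so that the joining logic can be tried without network access.

```python
import io


def extract_all_text(reader):
    """Join the extracted text of every page, separated by form feeds.

    `reader` is expected to behave like pypdf's PdfReader: it exposes
    `.pages`, and each page has an `extract_text()` method.
    """
    return "\f".join(page.extract_text() or "" for page in reader.pages)


def read_pdf_from_url(url, timeout=30):
    """Download a PDF and wrap it in a pypdf PdfReader (hypothetical helper)."""
    # Deferred imports: only needed when a download is actually performed.
    import requests
    from pypdf import PdfReader

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail early on HTTP errors
    return PdfReader(io.BytesIO(response.content))


if __name__ == "__main__":
    reader = read_pdf_from_url("https://arxiv.org/pdf/2101.05907.pdf")
    print(extract_all_text(reader))
```

Equations will still come out the same way as on the first page; this only widens the extraction to the whole document.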
For the last paragraph of the first page, pypdf==3.16.4 gives:
The main equation in quantum mechanics is the Schrodinger equation, that in one dimension and for a potential V(x,t)
is written as (for simplicity, we set ℏ=1)
i∂ψ(x,t)
∂t=−1
2m∂2ψ(x,t)
∂x2+V(x,t)ψ(x,t) (1)
You can see that the text is fine, but the math characters and the equation structure are not represented well.
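For comparison, the four extracted lines of equation (1) appear to encode the equation below. This is a reading of the extracted fragments written out in LaTeX, not something taken from the paper's source:

```latex
\begin{equation}
  i \frac{\partial \psi(x,t)}{\partial t}
    = -\frac{1}{2m} \frac{\partial^{2} \psi(x,t)}{\partial x^{2}}
      + V(x,t)\, \psi(x,t)
\end{equation}
```

The fraction bars and the superscripts are exactly the parts that get lost: numerators and denominators end up on separate lines, and the exponent 2 is flattened into the running text.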
Math text extraction will certainly stay suboptimal for a long time, but I've opened a ticket to improve the text extraction (the partial, phi, maybe also the hbar): https://github.com/py-pdf/pypdf/issues/2009
See also: Why text extracting is hard. In summary: pypdf will hopefully get better at extracting Greek letters. The full text extracted from the first page:
Bohm potential for the time dependent harmonic oscillator
Francisco Soto-Eguibar1, Felipe A. Asenjo2, Sergio A. Hojman3and H ´ector M.
Moya-Cessa1
1Instituto Nacional de Astrof´ ısica, ´Optica y Electr´ onica, Calle Luis Enrique Erro No. 1, Santa Mar´ ıa Tonanzintla,
Puebla, 72840, Mexico.
2Facultad de Ingenier´ ıa y Ciencias, Universidad Adolfo Ib´ a˜ nez, Santiago 7491169, Chile.
3Departamento de Ciencias, Facultad de Artes Liberales, Universidad Adolfo Ib´ a˜ nez, Santiago 7491169, Chile.
Departamento de F´ ısica, Facultad de Ciencias, Universidad de Chile, Santiago 7800003, Chile.
Centro de Recursos Educativos Avanzados, CREA, Santiago 7500018, Chile.
Abstract. In the Madelung-Bohm approach to quantum mechanics, we consider a (time dependent) phase that depends quadrati-
cally on position and show that it leads to a Bohm potential that corresponds to a time dependent harmonic oscillator, provided the
time dependent term in the phase obeys an Ermakov equation.
Introduction
Harmonic oscillators are the building blocks in several branches of physics, from classical mechanics to quantum
mechanical systems. In particular, for quantum mechanical systems, wavefunctions have been reconstructed as is the
case for quantized fields in cavities [1] and for ion-laser interactions [2]. Extensions from single harmonic oscillators
to time dependent harmonic oscillators may be found in shortcuts to adiabaticity [3], quantized fields propagating in
dielectric media [4], Casimir e ffect [5] and ion-laser interactions [6], where the time dependence is necessary in order
to trap the ion.
Time dependent harmonic oscillators have been extensively studied and several invariants have been obtained [7, 8, 9,
10, 11]. Also algebraic methods to obtain the evolution operator have been shown [12]. They have been solved under
various scenarios such as time dependent mass [12, 13, 14], time dependent frequency [15, 11] and applications of
invariant methods have been studied in di fferent regimes [16]. Such invariants may be used to control quantum noise
[17] and to study the propagation of light in waveguide arrays [18, 19]. Harmonic oscillators may be used in more
general systems such as waveguide arrays [20, 21, 22].
In this contribution, we use an operator approach to solve the one-dimensional Schr ¨odinger equation in the Bohm-
Madelung formalism of quantum mechanics. This formalism has been used to solve the Schr ¨odinger equation for
different systems by taking the advantage of their non-vanishing Bohm potentials [23, 24, 25, 26]. Along this work,
we show that a time dependent harmonic oscillator may be obtained by choosing a position dependent quadratic time
dependent phase and a Gaussian amplitude for the wavefunction. We solve the probability equation by using operator
techniques. As an example we give a rational function of time for the time dependent frequency and show that the
Bohm potential has di fferent behavior for that functionality because an auxiliary function needed in the scheme,
namely the functions that solves the Ermakov equation, presents two di fferent solutions.
One-dimensional Madelung-Bohm approach
The main equation in quantum mechanics is the Schrodinger equation, that in one dimension and for a potential V(x,t)
is written as (for simplicity, we set ℏ=1)
i∂ψ(x,t)
∂t=−1
2m∂2ψ(x,t)
∂x2+V(x,t)ψ(x,t) (1)arXiv:2101.05907v1  [quant-ph]  14 Jan 2021
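Some of the artifacts left in the output above (accents separated from their letters, the detached "ff" ligature, words hyphenated at a line end) can be patched up after extraction. The following is a rough post-processing sketch using only the standard library. The function name `clean_extracted_text` is made up for illustration, the rules are heuristics tuned to this one page, and the small tilde is assumed to be U+02DC; none of this is a pypdf feature, and it does nothing for the equation layout.

```python
import re
import unicodedata

# Spacing accent characters seen in the extracted text, mapped to the
# combining marks that Unicode normalization can merge into a letter.
SPACING_TO_COMBINING = {
    "\u00b4": "\u0301",  # acute accent
    "\u02dc": "\u0303",  # small tilde (assumed code point)
    "\u00a8": "\u0308",  # diaeresis
}
_ACCENTS = "".join(SPACING_TO_COMBINING)


def _attach_accent(match):
    """Put the accent after its letter, as a combining mark."""
    accent, letter = match.group(1), match.group(2)
    if letter == "\u0131":  # dotless i, used by TeX under an accent
        letter = "i"
    return letter + SPACING_TO_COMBINING[accent]


def clean_extracted_text(text):
    # Join words hyphenated across a line break, but only when the next
    # line starts in lower case ("quadrati-\ncally"), so that compounds
    # such as "Bohm-\nMadelung" keep their hyphen.
    text = re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)
    # Rejoin a detached "ff" ligature ("di fferent", "e ffect").
    text = re.sub(r"(?<=[a-z]) (?=ff)", "", text)
    # Drop a stray space between a letter and an accent that belongs to
    # the following lower-case letter ("Schr \u00a8odinger").
    text = re.sub(rf"(?<=[A-Za-z]) (?=[{_ACCENTS}][a-z\u0131])", "", text)
    # Move each accent onto the letter after it, then compose the pair.
    text = re.sub(rf"([{_ACCENTS}]) ?([A-Za-z\u0131])", _attach_accent, text)
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    print(clean_extracted_text("Universidad Adolfo Ib\u00b4 a\u02dc nez"))
```

Heuristics like these can misfire on other documents, for example when an accented lower-case word legitimately follows a space, so the result is worth checking against the PDF.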