Perform update in parallel #110

Merged Nov 10, 2024 (1 commit)

rtobar commented Nov 6, 2024

The update command implementation runs over files that are independent from each other. As such, the overall update operation can be trivially parallelised to speed things up.

With this change, the list of files that need to be compared/updated is collected in a first pass. This list is then given to a multiprocessing pool, which farms out the actual update of each individual file. The amount of parallelism is controlled through a new "jobs" parameter, command line option and environment variable. If no value is given for this option, all CPUs are used.
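The two-pass approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `collect_files`, `update_one` and `update_all` are hypothetical names, the per-file work is a stand-in that only counts lines, and running in-process when `jobs` is 1 is a choice made for this sketch. Handling of the command line option and environment variable is left out.

```python
import multiprocessing
import os
import tempfile
from pathlib import Path


def collect_files(root, suffix=".pot"):
    """First pass: walk the tree and list every file to process."""
    return sorted(p for p in Path(root).rglob(f"*{suffix}") if p.is_file())


def update_one(path):
    """Hypothetical stand-in for the per-file update; it only counts lines."""
    with open(path, encoding="utf-8") as f:
        return str(path), sum(1 for _ in f)


def update_all(root, jobs=None):
    """Second pass: farm the independent per-file work out to a pool."""
    files = collect_files(root)
    # No value given: use all CPUs (assumption mirroring the description).
    jobs = jobs or os.cpu_count() or 1
    if jobs == 1:
        # Sketch-only shortcut: skip the pool and run serially in-process.
        return [update_one(p) for p in files]
    with multiprocessing.Pool(processes=jobs) as pool:
        # map() returns results in the same order as the input list.
        return pool.map(update_one, files)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.pot", "b.pot"):
            Path(tmp, name).write_text('msgid ""\nmsgstr ""\n', encoding="utf-8")
        for path, lines in update_all(tmp):
            print(path, lines)
```

Because each file is handled independently, the worker needs no shared state; it only has to be a module-level function so the pool can pickle it.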

I noticed this chance for improvement when doing a test run of the update of .po files for the Spanish translation of the CPython documentation. Local numbers on my 8-core, hyper-threaded AMD Ryzen 7 5825U:

-j 1 (same as old behaviour)

real    12m5.402s
user    12m4.942s
sys     0m0.273s

-j 8

real    2m23.609s
user    17m45.201s
sys     0m0.460s

<no value given>

real    1m57.398s
user    26m22.654s
sys     0m0.989s
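For reference, the wall-clock speedups implied by the timings above can be worked out with a few lines of arithmetic. The snippet below only re-uses the `real` and `user` figures quoted in this description; the `seconds` helper and the dictionary layout are just for illustration.

```python
def seconds(minutes, secs):
    """Convert the 'XmY.YYYs' format printed by `time` into seconds."""
    return minutes * 60 + secs


# Figures copied from the three runs reported above.
runs = {
    "-j 1": {"real": seconds(12, 5.402), "user": seconds(12, 4.942)},
    "-j 8": {"real": seconds(2, 23.609), "user": seconds(17, 45.201)},
    "no value": {"real": seconds(1, 57.398), "user": seconds(26, 22.654)},
}

baseline = runs["-j 1"]["real"]
# Speedup = serial wall-clock time / parallel wall-clock time.
speedups = {label: baseline / t["real"] for label, t in runs.items()}

for label, factor in speedups.items():
    print(f"{label}: {factor:.2f}x relative to -j 1")
```

That works out to roughly a 5x speedup with `-j 8` and roughly 6x when all CPUs are used, while total CPU time (`user`) grows from about 12 minutes to about 26 minutes.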

Signed-off-by: Rodrigo Tobar <[email protected]>

shimizukawa left a comment

Thank you for your wonderful improvement suggestions.

Unfortunately, I wasn't able to measure the speedup myself, because the sphinx-doc project I used for the check does not have many documents. However, the change does not seem to have any adverse effect on the operation.
I will merge it.

shimizukawa self-assigned this Nov 10, 2024
shimizukawa merged commit ee2537a into sphinx-doc:master Nov 10, 2024
8 checks passed
rtobar deleted the parallel-update branch November 10, 2024 04:17

rtobar commented Nov 10, 2024

Thanks @shimizukawa for reviewing and merging. I wonder whether you'd be inclined to make a release to PyPI with these changes at some point? No rush though, we can always install the package from this git repository in the meantime.

shimizukawa commented

I'm going to drop py38 and release as soon as possible.

shimizukawa commented

2.3.0 has been shipped ;)
https://pypi.org/project/sphinx-intl/2.3.0/

rtobar commented Nov 10, 2024

@shimizukawa thank you very much, this is very helpful 😄

rtobar added a commit to python/python-docs-es that referenced this pull request Nov 15, 2024
These are small changes that slightly improve the documentation build
process and make the code more maintainable going forward.

First, the list of relative paths that need fixing in the cpython .rst
files has been simplified by removing unnecessary entries, and only the
files that need it are now updated (instead of running every update
over all files each time).

Second, the Makefile's `build` target was split into its constituent
sub-parts, so that the CI step that previously held a copy of the `sed`
commands now contains just one invocation of `make fix_relative_paths`.

Finally, the PR I submitted to `sphinx-intl` to perform updates in
parallel [has already been
accepted](sphinx-doc/sphinx-intl#110) and a
new version has been published, so the requirements list is now updated
to use that latest version (which speeds up the update process to
3.13).

---------

Signed-off-by: Rodrigo Tobar <[email protected]>