Perform update in parallel #110
Conversation
The update command implementation runs over files that are independent from each other. As such, the overall update operation can be trivially parallelised to speed things up.

This change introduces a two-step process: the list of files that need to be compared/updated is collected in a first pass, and this list is then given to a multiprocessing pool to farm out the actual update of each individual file. The amount of parallelism is controlled through a new "jobs" parameter, command line option and environment variable. If no value is given for this option, all CPUs are used.

I noticed this chance for improvement when doing a test run of the update of .po files for the Spanish translation of the CPython documentation. Local numbers on my 8-core, hyper-threaded AMD Ryzen 7 5825U:

- `-j 1` (same as the old behaviour): real 12m5.402s, user 12m4.942s, sys 0m0.273s
- `-j 8`: real 2m23.609s, user 17m45.201s, sys 0m0.460s
- no value given (all CPUs): real 1m57.398s, user 26m22.654s, sys 0m0.989s

Signed-off-by: Rodrigo Tobar <[email protected]>
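For illustration, here is a minimal sketch of that two-step scheme built around a `multiprocessing.Pool`. This is not sphinx-intl's actual code; the `collect_po_files` and `update_single_file` helpers and the `locales` directory are hypothetical stand-ins for the real implementation.

```python
import multiprocessing
from pathlib import Path


def collect_po_files(locale_dir: str) -> list[Path]:
    """First pass: gather the independent .po files to compare/update."""
    return sorted(Path(locale_dir).rglob("*.po"))


def update_single_file(po_path: Path) -> None:
    """Update one catalogue; each file is independent of the others."""
    # Real code would merge po_path against its corresponding .pot
    # template here; this stand-in only reports the file it was given.
    print(f"updated {po_path}")


def update_in_parallel(locale_dir: str, jobs: int | None = None) -> None:
    """Second pass: farm the per-file updates out to a worker pool."""
    files = collect_po_files(locale_dir)
    # processes=None makes Pool use os.cpu_count() workers, matching
    # the PR's default of "all CPUs" when no jobs value is given.
    with multiprocessing.Pool(processes=jobs) as pool:
        pool.map(update_single_file, files)


if __name__ == "__main__":
    update_in_parallel("locales", jobs=8)  # analogous to running with -j 8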
Thank you for your wonderful improvement suggestions.
Unfortunately, I wasn't able to verify the speed-up on the sphinx-doc project I used for checking, as it doesn't contain many documents. However, the change seems to have no negative effect on the operation.
I will merge it.
Thanks @shimizukawa for reviewing and merging. I wonder whether you'd be inclined to make a release to PyPI with these changes at some point? No rush though; we can always install the package from this git repository in the meantime.
I'm going to drop py38 and release as soon as possible.
2.3.0 has been shipped ;)
@shimizukawa thank you very much, this is very helpful 😄
These are small changes that slightly improve the documentation build process and make the code more maintainable going forward.

First, the list of relative paths that need fixing in cpython's .rst files has been simplified, removing unnecessary entries and updating only the files that actually need it (instead of running every update over all files each time).

Second, the `build` target of the Makefile was split into its constituent sub-parts, so the CI step that previously held a copy of the `sed` commands now contains a single invocation of `make fix_relative_paths`.

Finally, the PR I sent to `sphinx-intl` to perform updates in parallel [has now been accepted](sphinx-doc/sphinx-intl#110) and a new version has been published, so the requirements list is now updated to use that latest version (and thus speed up the update to 3.13).

---------

Signed-off-by: Rodrigo Tobar <[email protected]>