-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Bash from the process of building the site #623
Conversation
ba3c5da
to
a6dab16
Compare
8c554fd
to
f5fcb3e
Compare
Lines 77 to 114 in 988ae48
Can someone explain to me what's happening in this code part? i find it difficult to understand what's happening here. @mik-laj could you help me? |
@Bowrna see: https://stackoverflow.com/a/12498485 This generates a relative path for use in a container. |
Could you tell me why we need to convert the absolute path into relative path to run the script? Wouldn't it run fine with the absolute path? @mik-laj |
Because in container it will be in a different location ("/opt/airlfow/.....") rather than |
Ah I see @potiuk. As the sh file is replaced to run without docker dependency, i forgot about this point. Thanks |
Line 199 in fd7af05
Is this redirect still valid even after removing the docker part @mik-laj @potiuk @turbaszek @rossturk I have this doubt because we are redirecting to the path "/docs/${package_name}/stable/index.html". So I wanted to ensure if this is still valid even after removing the docker part for building site. Thanks |
f5fcb3e
to
8f77430
Compare
8f77430
to
f5d4b69
Compare
Nope. Not needed. I think it was not even needed before for a long time already. This is likely a remnant of a very old approach - long time defunct. I think what is important is that after buildng the site locallly, you can start the server and locally browse the site, also a check that after deploying to airflow site it still works should happen, but this is something we can likely do only after we merge it. |
f5d4b69
to
132dba4
Compare
@potiuk I get the following error when building the site and I am not sure how I can solve this. This occurs when
|
Hmmmm I may be wrong here but if I remember correctly this is one of the errors that occurs when the submodules aren't cloned. Have you run |
Yes after running this the
Am I missing downloading anything here? I checked in CONTRIBUTORS.md guide but didn't find anything relevant to this. |
@Bowrna I think this has to do with the version of bash. If I try this on my M1 Mac, I get an error: bash-3.2$ bash -version
bash -version
GNU bash, version 3.2.57(1)-release (arm64-apple-darwin21)
Copyright (C) 2007 Free Software Foundation, Inc.
bash-3.2$ readarray foo < <(ls -al)
readarray foo < <(ls -al)
ls -al
bash: readarray: command not found But if I try it on my Linux machine, I get this: rturk@caymus:~$ bash --version
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
rturk@caymus:~$ readarray foo < <(ls -al)
rturk@caymus:~$ echo ${foo[8]}
drwxr-xr-x 2 rturk caymus 4096 Oct 6 2021 .dbt I think perhaps
|
I think it has never been supposed to be run on Mac :). And one of the reasons we want to rewrite everything in Python, is that bash is no longer "common" accross multiple platform. The MacOS Bash is Bash 3 (for licencing reason - Apple refused to use Bash4 due to GPLv3 (https://news.ycombinator.com/item?id=20102597) and a lot of tools there are not really POSIX compliant, so there are plenty of hacks around that (readarray is one of the features of Bash4). The ultiimate goal we have is that Breeze and all our tools can be used WITHOUT having or using bash in the Host. Fully, Once we get there, the next step will be to make sure that it works on Windows. And that should be our goal here as well (even though site.sh is not really used by many people). So - let's just not mention in CONTRIBUTING.md that you need to install bash from brew. I deliberately avoided that necessity, because that it yet another prerequisite that adds friction to the first-time experience. Ideally just |
This has profound consequences. Many tools which "assume" bash is v3 on Mac will stop working if you make Bash 4 default. Even if you try to install it via Going bash-less is the only good route we can take - now when we have Python3.7+ everywhere (previously various options of having Python2 + Python 3 (3.5 as well) created similar set of problems, but since everyone is now on Python 3.7+ (3.6 EOL reached) basing "lowest common denominator" on Python 3.7 is great for such an environment (that's why we have the whole "Rewrite Breeze to Python" project now and not 2 years ago.. |
I'm sorry, what are you suggesting as an alternative? |
Remove Bash. Completely. I think this whole task is about removing bash and replacing it with Python :). Once the script is replaced with Bash, there is no need to mention it in CONTRIBUTING - because ..... there will be no bash to run any more. Or am I missing something :)? |
Yes, you are missing that this PR title is "replace site.sh" and the script we are talking about is |
Ahhhh. I see now 🤦 . Precisely. This was the whole purpose of the task - to replace all the bash usage. It makes no sense only to replace site.sh. The reason why we are doing it is not because "we do not like bash", but because "we want to get rid of bash from the host entirely" - to be able to run it all on MacOS, Linux, Windows or anything else that has "native" Python (and potentially has no bash). So the TITLE of the task is wrong. Should be "Remove Bash from the process of building the site". |
And the root cause of the confusion was that when the original issue was created, there was indeed only one See the description of the issue by @mik-laj - so that's why the title was wrong commented on 9 Dec 2020
|
Ah, I see the whole point now. Currently to see how this |
Trying to understand the below bash command. Could anyone tell me what's happening in this command airflow-site/landing-pages/check-links.sh Lines 59 to 63 in 445586c
When trying to execute the above script I just enabled print and saw it's logging a few html files. But when I took out that command and executed it via terminal it shows a big list of HTML files. I don't understand how this is different from executing in a script. When executing in terminal I see |
It fills the array 'pages' with the list of HTML files found. The problem you see is likely the so if you have
This wil produce "a file.html\0another file.html" Then when you read it with -d '' you will get array: ['a file.html', 'another file.html'] Otherwise you'd get: ['a', 'file.html', 'another', 'file.html'] The problem with differences you see is that the "\0" separated files are not broken over multiple lines because they are really a very long line with no spaces to allow the break happen. |
Could you share your view for this point? This is also another difference in the output when executing it in different ways @potiuk . |
I think it's just the matter of when it is called and whether the files are there when it is executed. Some earlier steps likely prepare the files in dist folder and maybe that step is missing when you run it manually ? Can't think of any other reason. |
I have tried executing both of ways consecutively without running any other step in between. Let me debug this again and come with more information @potiuk |
Hey @Bowrna are you going to continue with that one :) ? Not the highest prirority, so if we do not plan it, maybe better to close that one? |
I will work on this part this weekend @potiuk. As I got stuck on resolving the issue, I started fixing other issues and slowly I became dormant in working on this issue. |
It is common to delete part of a URL to go to the parent location.
then the user can delete part of the URL and go to:
When we do not have this redirect, the user will see a 404 error. In other words, all directories the user has access to should have an index.html file to improve UX. |
132dba4
to
9e4aad6
Compare
Approved to see. |
@Bowrna are you still working on this PR? |
@eladkal I didn't work on this for a long time as I was working on other PR. I will start working on this PR and one another in this weekend as I have some time due to the upcoming holiday week. |
As I am currently working on this PR, I will post the parts that I have completed and parts that are pending to keep on track.
In the case of |
@potiuk @mik-laj I am considering to use beautiful soup library, to extract the links from html files for check-site-links command. Is it fine or is there any better way to handle it? If there is please let me know. Edit: I will check if i can pull external links via regex pattern before trying to use bs4 Thanks. :) |
@potiuk @mik-laj
|
We don't use this check, so I think we can drop it for now. |
@mik-laj Could you tell me if the check-site-links command in site.sh is used or not? Because that command uses this script file and here they have used lynx to extract URL? Could you give me more explanation on that part? |
9e4aad6
to
06b6be5
Compare
Also, can this code be reviewed? I will try fixing this PR and getting it merged to main. |
Hello all :) do I have any update on this PR? |
Hmm, I honestly think we should not invest too much in it. We are discussing about switching to Pelikan which is going to change the process completely anyway. |
There are a number of problems with dependencies the current approach has (and it caused us some problems in the recent past) so I think this is somethign we have to do anyway soon. |
@potiuk Thanks for the update. I am happy to help in switching to Pelikan part :) |
could i close this PR @potiuk as this is not relevant if switching to Pelican? |
yes |
closes: #338