Hi everyone!
I need some help with string manipulation using sed and regex. I spent a whole day on trial and error and searching the web for a solution, but it's way beyond my capabilities, so maybe there are some sed/regex gurus here who are willing to give me a helping hand!
From everything I gathered around the web, it seems to be a rather complicated regex and sed substitution. Here we go!
What am I trying to achieve?
I have a lot of Markdown guides that I want to host on a self-hosted, Forgejo-based Git instance. However, classic Markdown links are not the same as the ones on GitHub/Forgejo...
Convert the following string:
[Some text](#Header%20Linking%20MARKDOWN.md)
Into
[Some text](#header-linking-markdown.md)
As you can see, these are the requirements:
- Pattern: [Some text](#link%20to%20header.md)
- Only edit what's between parentheses
- Replace space (%20) with -
- Everything as lowercase
- Links are sometimes in nested parentheses
  - e.g. (look here [Some text](#link%20to%20header.md))
- Do not change a line that begins with https (external links)
While the whole thing is probably a bit complex, the trickiest part is probably the nested parentheses :/
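For reference, the requirements above can be sketched in a few lines of Python. This is only a minimal sketch, not a sed solution: it assumes the link target always starts with # and contains no spaces or closing parentheses, and the names LINK_TARGET, normalize_target and convert_line (as well as the example.com URL) are made up for illustration.

```python
import re

# Matches a Markdown link target that starts with '#', e.g. "](#Some%20Header.md)".
# [^)\s]* stops at the first ')' so an enclosing "( ... )" is left alone.
# Assumption: targets never contain spaces or ')' themselves.
LINK_TARGET = re.compile(r"\]\((#[^)\s]*)\)")


def normalize_target(match: re.Match) -> str:
    """Replace %20 with '-' and lowercase the target only."""
    target = match.group(1).replace("%20", "-").lower()
    return f"]({target})"


def convert_line(line: str) -> str:
    """Rewrite only '#...' link targets; everything else is untouched."""
    return LINK_TARGET.sub(normalize_target, line)


if __name__ == "__main__":
    samples = [
        "[Some text](#Header%20Linking%20MARKDOWN.md)",
        "(look here [Some text](#Link%20To%20Header.md))",
        "[External](https://example.com/Some%20Page)",  # hypothetical URL
    ]
    for sample in samples:
        print(convert_line(sample))
```

Because the pattern is anchored on the "](" that opens a link target, the surrounding parentheses in the nested case never take part in the match, and https links are skipped simply because they do not start with #.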
What I tried
The furthest I got was the following:
sed -Ei 's|\(([^\)]+)\)|\L&|g' test3.md #make everything between parentheses lowercase
sed -i '/https/ ! s/%20/-/g' test3.md #change every %20 occurrence to -
These sed/regex substitutions are what I put together while roaming the web, but they have a lot of flaws and don't work with nested parentheses. Also, the second one would change every %20 occurrence in the file.
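To make the nested-parentheses flaw concrete, here is a small Python sketch that roughly mimics the first sed expression. It is an approximation under assumptions: Python's re module stands in for GNU sed, the sample line is invented, and lowercase_parens is a hypothetical helper name.

```python
import re

# Rough Python equivalent of the first sed expression above:
# match "(" + anything that is not ")" + ")" and lowercase the whole match.
PAREN_GROUP = re.compile(r"\(([^)]+)\)")


def lowercase_parens(line: str) -> str:
    """Lowercase every '(...)' run found by the pattern."""
    return PAREN_GROUP.sub(lambda m: m.group(0).lower(), line)


# Invented sample with a link inside an outer pair of parentheses.
nested = "(Look here [Some Text](#Link%20To%20Header.md))"
print(lowercase_parens(nested))
# → (look here [some text](#link%20to%20header.md))
```

The match starts at the outer opening parenthesis and runs to the first closing one, so the link text gets lowercased along with the target.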
The closest solution I found on Stack Overflow looks similar, but I wasn't able to fit it to my needs. Actually, my lack of regex/sed understanding makes it impossible for me to adapt it to my requirements.
I would appreciate any help, even if a change of tool is needed. However, I'm more interested in the learning process, so a script or CLI alternative is very much appreciated :) Actually, any help is appreciated :D!
Thanks in advance.
First, thanks again for sharing your knowledge with me. I really appreciate the time and effort you took to write all of this. I know that's a lot of thank-yous :/ but I'm really grateful; this is very valuable information that I will keep in my knowledge base. It's really time I learned proper bash/Python/Perl(?) scripting with all those tools (grep/sed/regex).
Second, YOU MISSED A DAMNED parenthesis, you fool xD!
mdlinks="$(grep -Po ']\((?!https).*\)' ~/mkdn)"
Took me some time to figure it out, with a very uninformative error:
bashscript.sh: line 8: unexpected EOF while looking for matching "'
But as expected, it works!
Next, to show you my appreciation and not take everything for granted and be spoon-fed, I tried to find a solution myself for something else. I will try to explain as best I can how I solved it.
The part before the hashtag needs to keep its initial form (it links to the original Markdown file). So, instead of just playing around with Perl and regex (which doesn't end well when done blindly, without the proper knowledge), I did some simple string manipulation. It's not very elegant but it does the trick, thanks to your well-written breakdown.
Yes, not very elegant, but it's the best I could do currently :/ However, I still got a YES effect :P
To answer your question:
As you can see in my string manipulation above, the part before the # needs to keep its original form :) (Sorry, I wasn't aware of this before working with the original files.) I solved it with some string manipulation as shown above.
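The same idea (keep everything before the #, convert only what comes after it) could also be sketched in Python. This is not the string manipulation I used, just a hedged illustration: it assumes the target has the shape file#fragment with no spaces or closing parentheses, and the names LINK, convert_fragment and convert_line, plus the sample file name, are hypothetical.

```python
import re

# Matches "](<file part>#<fragment>)" where the target does not start with https.
# (?!https) is the same negative lookahead used in the grep above.
# Assumption: neither part contains spaces or ')'.
LINK = re.compile(r"\]\((?!https)([^)#\s]*)#([^)\s]*)\)")


def convert_fragment(match: re.Match) -> str:
    """Keep the part before '#' as-is; dash and lowercase the part after it."""
    file_part, fragment = match.group(1), match.group(2)
    fragment = fragment.replace("%20", "-").lower()
    return f"]({file_part}#{fragment})"


def convert_line(line: str) -> str:
    return LINK.sub(convert_fragment, line)


# Hypothetical sample link: the file name keeps its case and its %20.
print(convert_line("[Guide](My%20Guide.md#Header%20Linking)"))
# → [Guide](My%20Guide.md#header-linking)
```

Splitting the target into two capture groups means the file part is copied back unchanged, so no separate step is needed to protect it.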
I'm a bit tired from all this searching and trial & error. Tomorrow I will try to wrap everything up and answer your post below :)! Also, I need to clean up the mess I made in my home directory xD.
Thanks again for your help! Have a good night/day!