How can I download HTML?
Hello! I just finished the HTML and CSS course on Codecademy and want to start writing it on my own now. I've been trying to figure out how to get HTML on my computer, and based on what I found online I used Notepad to write my code, but I don't like that it doesn't give those error notifications. I also tried Sublime Text, but for some reason it's not working (probably my code, but I don't know — it works in Notepad). I've done Java and Python before, and I'm starting to think getting HTML on my computer probably isn't the same process as getting Java. Do I have to download CSS separately? I know these are pretty basic questions lol, but if someone could help me out with getting HTML (and maybe CSS 😳) on my computer (Windows), I would really appreciate it!
In your browser, hit Ctrl+S and save the page as an HTML file (not MHTML). Then, inside the <head> tag, add a <base href="http://downloaded_site's_address.com"> tag pointing at the site you saved it from. For this webpage, for example, it would be <base href="http://stackoverflow.com">.
This makes sure that all relative links point back to where they're supposed to instead of to the folder you saved the HTML file in, so all of the resources (CSS, images, JavaScript, etc.) load correctly instead of leaving you with just HTML.
See MDN for more details on the <base> tag.
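If you'd rather patch a saved file with a script than edit it by hand, here's a minimal sketch of the <base> trick above in Python. The site address and the sample markup are placeholders, not from any real download; for a real file you'd read and write the saved .html instead of an inline string.

```python
# Sketch: insert a <base> tag right after <head> in saved HTML, so the
# page's relative links (CSS, images, scripts) resolve against the
# original site instead of your local folder.
site = "https://stackoverflow.com"  # placeholder: the site the page came from
html = (
    "<html><head><title>Saved page</title></head>"
    "<body><a href='/questions'>Questions</a></body></html>"
)

# Replace only the first <head> so the tag lands at the top of the head.
patched = html.replace("<head>", f'<head><base href="{site}">', 1)
print(patched)
```

With the <base> tag in place, the relative link /questions now resolves to https://stackoverflow.com/questions when you open the file locally.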
The HTML, CSS and JavaScript are sent to your computer when you request them over HTTP (for instance, when you enter a URL in your browser), so you already have those parts and can replicate them on your own PC or server. But if the website has server-side code (databases, some form of authentication, etc.), you will not have access to it, and therefore won't be able to replicate that part on your own PC/server.
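To make that concrete, here's a small self-contained sketch of the request/response round trip described above: a throwaway local HTTP server hands back a page of HTML, and the client fetches it exactly the way a browser would. The page content and port are made up for the demo; nothing here touches a real site.

```python
# Sketch: HTML is just text delivered over HTTP. We serve a tiny page from
# a local test server, then fetch it with a plain HTTP GET.
import http.server
import threading
import urllib.request

PAGE = b"<html><head><title>demo</title></head><body>Hello from the server</body></html>"

class DemoHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET with the same static HTML, like a static site would.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):
        pass  # keep the demo quiet

# Port 0 asks the OS for any free port.
server = http.server.HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/"
html = urllib.request.urlopen(url).read().decode()
server.shutdown()
print(html)
```

The `html` string the client ends up with is the complete page source — which is why you can save and replicate the front-end, but never see the server-side code that produced it.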
Heya!
So I have a problem. I'd like to download episodes from the official Pokémon TV website, https://watch.pokemon.com. Thing is, I want episodes in a specific language, let's say https://watch.pokemon.com/sv-se/#/season?id=pokemon-black-white. Now, I could just look at the HTML by inspecting the source code, manually copy the URL for each episode, and download the episodes with yt-dlp. But that is a lot of work, as I'd have to do it for each individual episode, and for each season I want. So what I want is to write a script (let's say, in Python) that grabs each link (href) from each <a> element of the class "btn-play" (or id "play_btn_reference").
The problem is that when I try to do this with beautifulsoup and requests, I get nothing back; if you choose "View page source" in your browser while on the site, you'll see what I mean. I read in a StackOverflow post that Selenium would help with scraping the HTML as rendered, but I couldn't exactly get it to work.
This is what I have tried thus far:
from bs4 import BeautifulSoup
from selenium import webdriver

browser = webdriver.Chrome()
url = "some_page"
browser.get(url)
html_source = browser.page_source
browser.quit()

soup = BeautifulSoup(html_source, 'lxml')
fname = "results.txt"
with open(fname, "w", encoding="utf-8") as output:
    output.write(soup.prettify())
Any of you fine chaps know how I could move forward? I'd just prefer to not have to manually download each episode, and though I have a little bit of experience with this, I don't know how to get past this hurdle. Any help is much appreciated!
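Not the OP's code, but here's a sketch of the link-extraction step the question describes, using only the standard library so it runs anywhere. The "btn-play" class comes from the question; the sample markup is invented for illustration. In the real script you'd feed it `browser.page_source` from Selenium (after the page's JavaScript has finished rendering) instead of the sample string.

```python
# Sketch: collect every href from <a class="btn-play"> elements in a chunk
# of rendered HTML, using the stdlib html.parser module.
from html.parser import HTMLParser

class PlayLinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # class can hold several space-separated names, so split before checking.
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "btn-play" in classes and "href" in attrs:
            self.links.append(attrs["href"])

# Invented sample of what the rendered DOM might look like; in practice
# this string would be browser.page_source from Selenium.
sample = """
<div>
  <a class="btn-play" href="/sv-se/player?id=ep1">Episode 1</a>
  <a class="btn-play" href="/sv-se/player?id=ep2">Episode 2</a>
  <a class="other" href="/about">About</a>
</div>
"""

parser = PlayLinkParser()
parser.feed(sample)
print(parser.links)
```

The collected links could then be handed to yt-dlp one by one. The reason requests alone returns "nothing" is that the episode list is built by JavaScript after the initial page load, so the raw HTTP response doesn't contain those <a> elements at all; Selenium (with an explicit wait for the elements to appear) is the usual way around that.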
Edit: fixed the poor formatting of the post