Hey tropers, we are about to undergo a large update to the TVTropes database to convert its encoding. During this time the login system will be offline. You'll still be able to read TVTropes, but you won't be able to post or edit anything during the conversion.
How long will the login system be down?
My most recent test run took 9 hours (my first test took 20+ hours, so this is after a lot of optimization). *** It will now take about 11 hours; see the update below.
Can you give me more details?
TVTropes was originally hosted on Windows servers back in the day, and all content on the site is encoded with Windows-1252 (a superset of ISO-8859-1, aka latin1).
According to W3Techs, only 1.3% of websites use ISO-8859-1. I'm guessing a large share of that is TVTropes, considering we have millions of pages defined with that charset.
The majority of the internet is encoded in UTF8. This conversion will bring us up to modern standards with the rest of the web and allow us to more easily support other languages and icons. It will also allow us to use more modern tools to help with editing, such as adding a WYSIWYG editor option. It will also help with code development, as we often have to add special workarounds to continue supporting this long-deprecated charset.
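For anyone curious what the difference looks like in practice, here is a small Python sketch (not the site's code, just the standard library) showing how the same text comes out in latin1, Windows-1252 and UTF8. The helper name `fits` is made up for the illustration.

```python
# One string, three encodings. Standard library only.
text = "café"

latin1 = text.encode("latin-1")   # ISO-8859-1
cp1252 = text.encode("cp1252")    # Windows-1252
utf8 = text.encode("utf-8")

# é is a single byte (0xE9) in both legacy charsets and two bytes in UTF-8.
assert latin1 == cp1252 == b"caf\xe9"
assert utf8 == b"caf\xc3\xa9"

# Windows-1252 puts printable characters in 0x80-0x9F,
# where ISO-8859-1 only has control codes.
assert b"\x80".decode("cp1252") == "€"
assert b"\x80".decode("latin-1") == "\x80"


def fits(s: str, codec: str) -> bool:
    """Hypothetical helper: can `s` be stored in `codec` without loss?"""
    try:
        s.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Cyrillic and emoji only fit in UTF-8.
for sample in ("Привет", "😀"):
    print(sample, fits(sample, "cp1252"), fits(sample, "utf-8"))
```

The last loop is the reason other languages and icons get easier after the conversion: the legacy charsets simply have no bytes for those characters.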
Are we changing anything else?
While we have the login system offline, we are going to upgrade the edit history database table to include a sequence number. We've always wanted to do this, but it could not be done while the site was online; that table has 40M+ rows of data. With this change, you'll be able to jump to any page of results when filtering edits by a specific page or a specific user. We already have this for the forum, which is why you can jump to, for example, page 500 of a long-running thread without a long delay.
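To give a rough idea of why a sequence number helps, here is a minimal sketch using Python's built-in sqlite3. The table and column names (`edits`, `page_name`, `seq`) and the page size are assumptions for the example, not the real TVTropes schema. The idea is that if every edit to a page is numbered 1, 2, 3, ..., then page N of the history is just a range of sequence numbers that an index can find directly, rather than something the database has to count its way up to.

```python
import sqlite3

PAGE_SIZE = 3  # tiny on purpose so the demo is easy to follow

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE edits (
        page_name TEXT    NOT NULL,
        seq       INTEGER NOT NULL,  -- 1, 2, 3, ... within each page_name
        summary   TEXT    NOT NULL,
        PRIMARY KEY (page_name, seq) -- doubles as the lookup index
    )
    """
)
rows = [("Main/SomeTrope", n, f"edit {n}") for n in range(1, 11)]
conn.executemany("INSERT INTO edits VALUES (?, ?, ?)", rows)


def fetch_page(page_name: str, page_number: int):
    """Return one page of edits via a range on seq instead of OFFSET."""
    first = (page_number - 1) * PAGE_SIZE + 1
    last = page_number * PAGE_SIZE
    cur = conn.execute(
        "SELECT seq, summary FROM edits "
        "WHERE page_name = ? AND seq BETWEEN ? AND ? ORDER BY seq",
        (page_name, first, last),
    )
    return cur.fetchall()


print(fetch_page("Main/SomeTrope", 3))
```

With ten edits and three per page, page 3 holds edits 7 to 9 and page 4 holds only edit 10. A per-user view would need its own sequence number along the same lines.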
When will this happen?
UPDATE: (take two) This is now scheduled for Mon Dec 5th at 8:30PM PST until Tue Dec 6th at 7:30AM
At 8:30PM PST you will be logged out of TVTropes, so make sure to save anything before that time. Once the conversion is done the next morning, I'll point the code to the new server and everyone should be auto logged in. If not, when you see the all-clear announcement at the top of the page, go ahead and try logging in again.
The process will take approximately 11 hours. I had it down to 9 hours until I found examples of UTF8-encoded values inside latin1 columns, so I had to add some extra testing to ensure that data doesn't get double-encoded in the process.
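For the curious, the double-encoding problem can be sketched in a few lines of Python. This is only an illustration of the general idea, not the actual migration code, and the function name `to_utf8` is made up. It assumes each value is handled as raw bytes: if the bytes already form valid UTF8 they are left alone, otherwise they are treated as Windows-1252 and converted.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return UTF-8 bytes for a value read from a nominally latin1 column."""
    try:
        raw.decode("utf-8")   # strict check: raises on invalid sequences
        return raw            # already UTF-8 (or plain ASCII), leave as-is
    except UnicodeDecodeError:
        pass
    try:
        text = raw.decode("cp1252")
    except UnicodeDecodeError:
        text = raw.decode("latin-1")  # a few bytes are undefined in cp1252
    return text.encode("utf-8")


legacy = "café".encode("cp1252")    # b"caf\xe9", needs converting
already = "café".encode("utf-8")    # b"caf\xc3\xa9", must not be touched

# Converting blindly turns the two UTF-8 bytes into two separate characters.
naive = already.decode("cp1252").encode("utf-8")

print(to_utf8(legacy).decode("utf-8"))
print(to_utf8(already).decode("utf-8"))
print(naive.decode("utf-8"))
```

The first two lines print the same word, while the naive path prints the familiar garbled form. The check is a heuristic: a legacy value whose bytes happen to form valid UTF8 would be left unconverted, which is why extra testing on real data matters.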
If you have any issues during the migration please send an email to thestaff@tvtropes.org
UPDATE: (complete) The migration is officially complete. It ended up taking nearly 12 hours but we got there. All data on TVTropes (hundreds of millions of rows of data) has now been converted from Latin1 to UTF8. We now have the same encoding as 99% of other websites and can support special characters and other languages. It will also allow us to build other tools such as a WYSIWYG editor.
Edited by itcdr on Dec 6th 2023 at 8:48:40 AM
So this is going to fix the various bugs surrounding accented characters?
Thanks, chief.
TroperWall / WikiMagic Cleanup
@Zuxtron, yes, it should make it so we can easily work with accented characters, characters in many other languages, and even things like iPhone emoji if we wanted to support those. In most areas of the site we try to convert things like accented characters to HTML entities (e.g. &eacute;) due to the issues with them in the old Windows encoding. After this migration we won't have to, and you'll be able to just type the accented character from your keyboard and have it stay like that.
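As a rough picture of the entity workaround described above, here is a short Python sketch. It is not the site's code: `to_entities` is a made-up helper, and it produces numeric references (such as &#233;) through Python's built-in `xmlcharrefreplace` error handler rather than named entities.

```python
import html


def to_entities(text: str) -> str:
    """Hypothetical helper: replace every non-ASCII character with an
    HTML numeric character reference so the result is pure ASCII."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


escaped = to_entities("Thérèse")
print(escaped)                  # what ends up stored in the page source
print(html.unescape(escaped))   # what the browser shows

# With UTF-8 storage the detour is unnecessary: the text round-trips as typed.
stored = "Thérèse".encode("utf-8")
print(stored.decode("utf-8"))
```

The same handler turns every letter of a Cyrillic word into a numeric reference, which is why source text in those languages is so hard to read under the old encoding.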
Oh Yes! Thank you very much!
Please note, I didn't want to annoy anyone with the wishlist/bug-report on this issue, so if I did, I'm sorry!
AFK with issues, will return
Just want to clarify: after this update, if I go to a troper's edit history and click on an edit, will it take me right to the edit they made versus just the trope's/work's history page?
Because if so, that would be a very helpful feature, especially for pages that accrue a lot of edits.
Kaito is an alien and he is kinda spacey, coming from the universe to party and go crazy!
So to clarify, will our accounts not be able to post during this time?
"That's right mortal. By channeling my divine rage into power, I have forged a new instrument in which to destroy you."
As stated in that first post, no posting or editing will be possible and we won't be able to log in.
Good to know.
ilovewildkratts1, I just added that to the list of Upcoming Editing Improvements.
Now monitoring Wishlist and Bugs
Didn't know this site wasn't already in UTF8. This is good news!
Edited by JHD0919 on Dec 1st 2023 at 4:57:06 AM
This is Idol Tap. (My Troper Wall)
This is really good news, indeed. I'm also looking forward to the data conversion.
Edited by gjjones on Dec 1st 2023 at 5:09:30 AM
He/His/Him. No matter who you are, always Be Yourself.
Should we expect any special complications logging in afterwards?
The Revolution Will Not Be Tropeable
Does this have a chance of breaking any current URLs once complete?
Eating a Vanilluxe will give you frostbite.
@rmctagg09, I am copying all data to a new server to do this conversion for safety. So if anything happens we can always switch back to the old server. If everything goes correctly it should convert all characters properly and no one should notice any differences when reading the site.
And all current wikiwords are forced to latin anyway, so I don't think it can affect existing links.
TroperWall / WikiMagic Cleanup
Good to know, thanks!
he/him
[nervous sounds]
New theme music also a box
I'd rather make sure, and I don't want to cause any trouble, but last year I started working on the Russian translation of TV Tropes right on the site, and I guess we all know that translations here are somewhat problematic. It seems like your upgrade will make things easier. What can I, as a user of the Cyrillic alphabet, expect? For example, will pages' URLs still be in English? When I edit pages in Russian, almost all symbols get converted into HTML codes (honestly, not a really big problem thanks to Page Source), so will that be fixed too? Will there be any changes to namespaces? And does all of this apply to other languages as well (if my memory serves me right, Chinese and Japanese had troubles too, and unlike me their translators could not even start)?
So since the update will fix accented characters, would this include accented characters in links?
Take this link to the Wikipedia article on Marie-Thérèse Charlotte for example. Unless you replace the accented characters with non accented characters and hope the link still works afterwards (it's not guaranteed), the link won't work because of the accented characters.
Victor of HGS S320 | "There's rosemary, that's for remembrance. Pray you, love, remember."
x12: Ah, good to know. Thank you!
Kaito is an alien and he is kinda spacey, coming from the universe to party and go crazy!
Agreed that that should be fixed. Why should fairly basic diacritics have to break links?
Edited by Lymantria on Dec 1st 2023 at 3:09:38 PM
Join the Five-Man Band cleanup project!
"Server updates/tests" really explains what happened to my account over the past few months. For one, I had to change my password because I could've sworn I got hacked back in the summer; for another, my password reset itself halfway through October (which was upsetting, to say the least).
I wish you godspeed and good luck, but I'm still a little worried.
Now everyone pat me on the back and tell me how clever I am!
~Cutegirl920fire, that is the beauty and advantage of UTF8: a character with an accent can be treated as if it had none. Say á = a: a search engine, and even the browser, will treat the former as if it were the latter. Unfortunately this is not the case when the character is written as an HTML entity, %E1 or %C3%A1; some browsers just ignore the characters completely. You can test this out here, or go to that page on Wikipedia, open your in-browser search dialogue and enter the characters in the name as if without accents.
I'm fairly confident that the special HTML URL character encoding will similarly be of no great concern. Please note, however, that it is a different situation: that relies on the browser to encode the characters into ASCII to actually request the linked page. Even so, there are few sites left online that maintain that primary encoding for content.
edit: some clarification
Edited by skewview on Dec 1st 2023 at 5:22:35 PM
AFK with issues, will return
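On the %E1 versus %C3%A1 point raised above, a quick sketch with Python's standard `urllib.parse` shows where the two forms come from. This only illustrates percent-encoding in general, not how TVTropes or Wikipedia build their links.

```python
from urllib.parse import quote, unquote

name = "Marie-Thérèse"

# The same accented letters percent-encode differently per charset.
utf8_url = quote(name, encoding="utf-8")      # é -> %C3%A9, è -> %C3%A8
latin1_url = quote(name, encoding="latin-1")  # é -> %E9,    è -> %E8
print(utf8_url)
print(latin1_url)

# A UTF-8 site decodes its own form correctly...
print(unquote(utf8_url))

# ...but the latin1 form only comes back intact when decoded as latin1.
print(unquote(latin1_url, encoding="latin-1"))
print(unquote(latin1_url) == name)
```

Both forms are plain ASCII on the wire, so a link with accented characters only works when the page that writes it and the site that reads it agree on the charset.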