Converting all data to UTF8 (now complete)

Hey tropers, we are about to run a large update on the TVTropes database to convert its encoding. During this time the login system will be offline. You'll still be able to read TVTropes, but you won't be able to post or edit anything during the conversion.

How long will the login system be down?

My most recent test run took 9 hours (my first test took 20+ hours, so this is after a lot of optimization). UPDATE: it will now take about 11 hours; see the update below.

Can you give me more details?

TVTropes was originally hosted on Windows servers back in the day, and all content on the site is encoded with Windows-1252 (a superset of ISO-8859-1, aka latin1).
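For anyone curious how the two charsets differ, here is a minimal Python sketch. It relies only on Python's built-in `cp1252`, `latin-1`, and `utf-8` codecs; the sample byte is an arbitrary illustration, not something pulled from TVTropes data.

```python
# Byte 0x93 is a curly opening quote in Windows-1252 but a C1 control
# character in strict ISO-8859-1 (Python's "latin-1" codec).
raw = b"\x93"

as_cp1252 = raw.decode("cp1252")   # left double quotation mark, U+201C
as_latin1 = raw.decode("latin-1")  # invisible control character, U+0093

# The same quote character takes three bytes once stored as UTF-8.
as_utf8_bytes = as_cp1252.encode("utf-8")

# Outside the 0x80-0x9F range, the two legacy charsets agree byte for byte.
shared = list(range(0x00, 0x80)) + list(range(0xA0, 0x100))
same_elsewhere = all(
    bytes([b]).decode("cp1252") == bytes([b]).decode("latin-1") for b in shared
)

print(hex(ord(as_cp1252)), hex(ord(as_latin1)), as_utf8_bytes, same_elsewhere)
```

Curly quotes, apostrophes, and dashes all live in that 0x80-0x9F range, which is why they are the characters most visibly affected by a conversion like this one.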

According to W3Techs, only 1.3% of internet traffic is ISO-8859-1. I'm guessing a large share of that is TVTropes, considering we have millions of pages defined with that charset.

The majority of the internet is encoded in UTF8. This conversion will bring us up to modern standards with the rest of the web and allow us to more easily support other languages and icons. It will also allow us to use more modern tools to help with editing, such as adding a WYSIWYG editor option. It will also help with code development, as we often have to add special workarounds to continue supporting this long-deprecated charset.

Are we changing anything else?

While we have the login system offline, we are going to upgrade the edit history database table to include a sequence number. We've always wanted to do this, but it could not be done while the site was online, since that table has 40M+ rows of data. With this change, we'll be able to jump easily to any results page when filtering edits by a specific page or a specific user. We already have this for the forum, which is why you can jump to, for example, page 500 of a long-running thread without a long delay.
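To show why a sequence number helps with jumping to an arbitrary page, here is a rough sketch using Python's built-in `sqlite3` module. The `edits` table, its columns (`page`, `seq`, `body`), the page size, and the `fetch_page` helper are all hypothetical stand-ins, and a per-page sequence is an assumption; this is not the actual TVTropes schema or code.

```python
import sqlite3

PAGE_SIZE = 20  # hypothetical number of edits shown per results page

conn = sqlite3.connect(":memory:")
# Hypothetical schema: seq counts edits within each wiki page, starting at 1.
conn.execute(
    "CREATE TABLE edits (page TEXT, seq INTEGER, body TEXT, "
    "PRIMARY KEY (page, seq))"
)
conn.executemany(
    "INSERT INTO edits VALUES (?, ?, ?)",
    [("Main/HomePage", n, f"edit {n}") for n in range(1, 1001)],
)

def fetch_page(conn, page, page_number):
    """Return one results page by computing the seq range directly."""
    first = (page_number - 1) * PAGE_SIZE + 1
    last = page_number * PAGE_SIZE
    return conn.execute(
        "SELECT seq, body FROM edits "
        "WHERE page = ? AND seq BETWEEN ? AND ? ORDER BY seq",
        (page, first, last),
    ).fetchall()

rows = fetch_page(conn, "Main/HomePage", 50)
print(len(rows), rows[0], rows[-1])
```

Because the range of sequence numbers for page 50 can be computed up front, the database can go straight to those rows through the index instead of counting past every earlier row, which is what makes deep jumps fast.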

When will this happen?

UPDATE: (take two) This is now scheduled for Mon Dec 5th at 8:30PM PST until Tue Dec 6th at 7:30AM

At 8:30PM PST you will be logged out of TVTropes, so make sure to save anything before that time. Once the conversion is done the next morning, I'll point the code to the new server and everyone should be logged in automatically. If not, when you see the all-clear announcement at the top of the page, go ahead and try logging in again.

The process will take approximately 11 hours. I had it down to 9 hours until I found examples of UTF8-encoded values inside latin1 columns, so I had to add some extra testing to ensure that data doesn't get double-encoded in the process.
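As an illustration of the double-encoding problem, here is a minimal Python sketch of one common way to guard against it. The `to_utf8_text` helper is hypothetical and is not the actual migration code; it assumes each stored value is available as raw bytes and that anything that is not valid UTF8 is Windows-1252.

```python
def to_utf8_text(raw: bytes) -> str:
    """Decode a value from a legacy column without double-encoding it."""
    try:
        # Bytes that are already valid UTF-8 are decoded as UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Anything else is assumed to be Windows-1252 (an assumption here).
        return raw.decode("cp1252")

legacy_value = "café".encode("cp1252")  # how older rows would be stored
already_utf8 = "café".encode("utf-8")   # a UTF-8 value sitting in a latin1 column

# Both variants come out as the same text.
print(to_utf8_text(legacy_value), to_utf8_text(already_utf8))

# Blindly treating every value as Windows-1252 garbles the UTF-8 one.
naive = already_utf8.decode("cp1252")
print(naive)
```

This check is a heuristic: plain ASCII is identical in both encodings, a few Windows-1252 byte sequences happen to be valid UTF8 as well, and Python's `cp1252` codec raises an error on the handful of bytes that charset leaves undefined, so a real migration would need extra handling for those cases.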

If you have any issues during the migration please send an email to thestaff@tvtropes.org

UPDATE: (complete) The migration is officially complete. It ended up taking nearly 12 hours but we got there. All data on TVTropes (hundreds of millions of rows of data) has now been converted from Latin1 to UTF8. We now have the same encoding as 99% of other websites and can support special characters and other languages. It will also allow us to build other tools such as a WYSIWYG editor.

Edited by itcdr on Dec 6th 2023 at 8:48:40 AM

SuperFromND Professional Amateur from C:/Program Files Since: Oct, 2021 Relationship Status: watch?v=dQw4w9WgXcQ
#326: Dec 23rd 2023 at 3:57:03 AM

Not sure if this is known already, but it appears that the Featured page still assumes contents are non-UTF-8. The Featured Trope today has some mojibake in place of what are supposed to be CJK characters: https://static.tvtropes.org/pmwiki/pub/images/tvt_mojibake_on_front.png

Eddy1215 Since: May, 2010 Relationship Status: How YOU doin'?
#327: Dec 23rd 2023 at 2:58:27 PM

Okay, I'd like to mention that I'm not comfortable with the new order of trope edits, with the oldest ones first and the newer ones in back. What was wrong with the previous way they were organized?

A man who admires many forms of fiction.
Amonimus the Retromancer from <<|Wiki Talk|>> (Sergeant) Relationship Status: In another castle
Codae Since: Aug, 2022
#329: Dec 25th 2023 at 7:46:26 AM

Are the first two "changes" in this edit just an automatic result of the update changing the encoding of certain apostrophes and dashes?

bwburke94 Friends forevermore from uǝʌɐǝɥ Since: May, 2014 Relationship Status: RelationshipOutOfBoundsException: 1
#330: Dec 25th 2023 at 7:48:26 AM

Yes, it appears so.

I had a dog-themed avatar before it was cool.
SeptimusHeap from Switzerland (Edited uphill both ways) Relationship Status: Mu
#331: Jan 22nd 2024 at 11:29:56 PM

Closing as done. Further issues/requests can be put on Query Bugs and Query Wishlist.

"For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled." - Richard Feynman

Total posts: 331