Converting all data to UTF8 (now complete)

Hey tropers, we are about to run a large update on the TVTropes database to convert the encoding. During this time the login system will be offline. You'll still be able to read TVTropes, but you won't be able to post or edit anything during the conversion.

How long will the login system be down?

My most recent test run took 9 hours (my first test took 20+ hours, so this is after a lot of optimization). ***It will now take 11 hours; see the update below.

Can you give me more details?

TVTropes was originally hosted on Windows servers back in the day, and all content on the site is encoded with Windows-1252 (a superset of ISO-8859-1, aka latin1).
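To make the charset difference concrete, here is a minimal Python sketch. The sample bytes are invented for illustration and are not taken from the TVTropes database. It shows that the same bytes read differently as Windows-1252 and as strict ISO-8859-1 in the 0x80-0x9F range, and what those characters become once re-encoded as UTF-8.

```python
# Curly quotes around "café", written as Windows-1252 bytes (made-up sample).
raw = b"\x93caf\xe9\x94"

# Windows-1252 maps 0x93/0x94 to curly quotes and 0xE9 to "é".
as_cp1252 = raw.decode("cp1252")

# Strict ISO-8859-1 maps 0x93/0x94 to invisible C1 control characters instead.
as_latin1 = raw.decode("latin-1")

# Re-encoding the correctly decoded text gives multi-byte UTF-8 sequences.
utf8_bytes = as_cp1252.encode("utf-8")

print(as_cp1252)
print([hex(ord(ch)) for ch in as_latin1])
print(utf8_bytes)
```

The point of the sketch is only that a conversion has to pick the right source charset; decoding the same bytes as strict latin1 would silently turn the quotes into control characters.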

According to W3Techs, only 1.3% of internet traffic is ISO-8859-1. I'm guessing a large amount of that is TVTropes, considering we have millions of pages defined with that charset.

The majority of the internet is encoded in UTF8. This conversion will bring us up to modern standards with the rest of the web and allow us to more easily support other languages and icons. It will also allow us to use more modern tools to help with editing, such as adding a WYSIWYG editor option, and it will help with code development, as we often have to add special workarounds to keep supporting this long-deprecated charset.

Are we changing anything else?

While we have the login system offline, we are going to upgrade the edit history database table to include a sequence number. We've always wanted to do this, but it could not be done while the site was online. That table has 40M+ rows of data. With this change, you'll be able to jump straight to any page of results when filtering edits by a specific page or a specific user. We already have this for the forum, which is why you can jump to, say, page 500 of a long-running thread without a long delay.
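As a rough illustration of why a sequence number helps with jumping to an arbitrary page, here is a small Python/SQLite sketch. Everything in it is an assumption made for the example: the table layout, the column names (`user`, `seq`, `summary`), the page size, and the idea that the sequence restarts at 1 for each user. The real TVTropes schema is not described in this post.

```python
import sqlite3

PAGE_SIZE = 20  # hypothetical page size

conn = sqlite3.connect(":memory:")
# Hypothetical table: "seq" numbers each user's edits 1, 2, 3, ...
conn.execute(
    "CREATE TABLE edits (user TEXT, seq INTEGER, summary TEXT, "
    "PRIMARY KEY (user, seq))"
)
conn.executemany(
    "INSERT INTO edits VALUES (?, ?, ?)",
    [("alice", n, f"edit {n}") for n in range(1, 10_001)],
)

def fetch_page(conn, user, page):
    """Return one page of a user's edits via a range on seq, not OFFSET."""
    first = (page - 1) * PAGE_SIZE + 1
    last = page * PAGE_SIZE
    return conn.execute(
        "SELECT seq, summary FROM edits "
        "WHERE user = ? AND seq BETWEEN ? AND ? ORDER BY seq",
        (user, first, last),
    ).fetchall()

rows = fetch_page(conn, "alice", 500)
print(rows[0], rows[-1])
```

With a dense sequence number, page 500 turns into an index range lookup, whereas an `OFFSET`-style query generally has to walk past all the earlier rows first.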

When will this happen?

UPDATE: (take two) This is now scheduled for Mon Dec 5th at 8:30PM PST until Tue Dec 6th at 7:30AM

At 8:30PM PST you will be logged out of TVTropes, so make sure to save anything before that time. Once the conversion is done the next morning, I'll point the code to the new server and everyone should be logged back in automatically. If not, when you see the all-clear announcement at the top of the page, go ahead and try logging in again.

The process will take approximately 11 hours. I had it down to 9 hours until I found examples of UTF8-encoded values inside latin1 columns, so I had to add some extra testing to ensure that data doesn't get double-encoded in the process.
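The double-encoding problem can be sketched in a few lines of Python. This is not the actual migration code; the function name `to_utf8` and the "try UTF-8 first" heuristic are assumptions chosen to show the general idea of leaving already-converted values alone.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return UTF-8 bytes for a value stored in a column labelled latin1.

    Heuristic (an assumption, not the real migration logic): if the bytes
    already decode cleanly as UTF-8, leave them alone; otherwise treat
    them as Windows-1252 and convert.
    """
    try:
        raw.decode("utf-8")
        return raw  # already UTF-8, converting again would double-encode
    except UnicodeDecodeError:
        # errors="replace" covers the few byte values cp1252 leaves undefined
        return raw.decode("cp1252", errors="replace").encode("utf-8")

latin1_value = "café".encode("cp1252")  # b'caf\xe9'
utf8_value = "café".encode("utf-8")     # b'caf\xc3\xa9'

# What a blind conversion would do to a value that was already UTF-8.
double_encoded = utf8_value.decode("cp1252").encode("utf-8")

print(to_utf8(latin1_value))
print(to_utf8(utf8_value))
print(double_encoded)
```

A check like this is not foolproof: a short Windows-1252 string can occasionally happen to be valid UTF-8, which is presumably part of why the extra testing added time.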

If you have any issues during the migration please send an email to thestaff@tvtropes.org

UPDATE: (complete) The migration is officially complete. It ended up taking nearly 12 hours but we got there. All data on TVTropes (hundreds of millions of rows of data) has now been converted from Latin1 to UTF8. We now have the same encoding as 99% of other websites and can support special characters and other languages. It will also allow us to build other tools such as a WYSIWYG editor.

Edited by itcdr on Dec 6th 2023 at 8:48:40 AM

lalalei2001 Since: Oct, 2009
#176: Dec 6th 2023 at 8:43:55 AM

I like the old source fonts in the history and editing page too (I think that was what was being referenced) but everything else looks great!

Edited by lalalei2001 on Dec 6th 2023 at 11:45:24 AM

The Protomen enhanced my life.
Konnor Naïve Fool from Elsewhere Since: Dec, 2019 Relationship Status: A cockroach, nothing can kill it.
#177: Dec 6th 2023 at 8:44:32 AM

when I edit a page on my phone the text on the page I’m editing looks weird, and different. Is that how it’s supposed to be?

Don't ask about my profile picture. I don't know either.
AuraXtreme Since: Mar, 2013
#178: Dec 6th 2023 at 8:44:38 AM

I'm going to have to concur with the request for bringing back/providing an option for the coding font when editing pages. It's much easier to keep track of the markup for italicization and boldfacing with a fixed-width font.

Dirtyblue929 Since: Dec, 2012 Relationship Status: [TOP SECRET]
#179: Dec 6th 2023 at 8:44:44 AM

For those saying it looks the same, that's a good thing. If you check the charset in the headers you'll see it's UTF8 instead of ISO-8859-1. 99% of websites use UTF8 so we were way behind. It shouldn't be noticeable unless you start editing with special characters or in other languages. You'll notice when you try to add special characters in the wiki it won't automatically convert to the html entity as we can now support the actual character.
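For anyone curious what "convert to the html entity" means in practice, here is a small Python sketch. It only approximates the old behaviour using Python's built-in `xmlcharrefreplace` error handler; how the wiki software actually did the conversion is not stated in this thread, and the sample text is made up.

```python
import html

text = "日本語 “quoted”"  # made-up sample with Japanese text and curly quotes

# Approximation of the old behaviour: characters that do not fit in the
# single-byte charset are replaced with numeric HTML entities.
as_entities = text.encode("latin-1", errors="xmlcharrefreplace").decode("latin-1")

# With UTF-8, the same characters can be stored directly.
as_utf8 = text.encode("utf-8")

print(as_entities)
print(as_utf8.decode("utf-8"))
print(html.unescape(as_entities) == text)
```

Both forms render the same in a browser; the difference is only in what you see in the page source and the edit box.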

Sorry if I'm misunderstanding, but is this to say that the use of the font for viewing pages and forum posts being used when editing a page and viewing its history is intended? Because it feels like a bug considering that the box I'm writing this very post in still uses a different, fixed-width font like before.

That said, I do appreciate all the work put into this migration, congratulations!

Edited by Dirtyblue929 on Dec 6th 2023 at 8:46:32 AM

skewview Since: Jun, 2013
#180: Dec 6th 2023 at 8:45:02 AM

~itcdr Yes, congratulations on the success of the effort!

Edited by skewview on Dec 6th 2023 at 4:47:03 PM

AFK with issues, will return
Redmess Redmess from Netherlands Since: Feb, 2014
#181: Dec 6th 2023 at 8:47:22 AM

Yeah, for example, that Word quote I just did would just throw up code for those quotation marks under the old char set.

And yeah, the edit box font is different, but I think it's a good thing, because now you can more clearly see what the actual page is going to look like. A nice next step would be for the edit box to auto-replace wiki links and such, so that it shows you whether it works or not right away.

Edited by Redmess on Dec 6th 2023 at 5:50:12 PM

Hope shines brightest in the darkest times
OmegaPC777 He do be excited, bro. (He/him) from Maplewood, MN Since: Apr, 2021 Relationship Status: They're my lobster
#182: Dec 6th 2023 at 8:47:45 AM

Well, what do you know? Everyone and everything's back!

Edited by OmegaPC777 on Dec 6th 2023 at 12:22:35 PM

An excited Wally Walrus for everyone! (Check out my troper wall if you can!)
Eggy0 Wizards Don't Own Cats (she/her) (Holding A Herring) Relationship Status: Faithful to 2D
#183: Dec 6th 2023 at 8:48:48 AM

Yay! Happy to see everything went well this time!

Hylarn (Don’t ask)
#184: Dec 6th 2023 at 8:49:08 AM

So when does the banner go away?

lalalei2001 Since: Oct, 2009
#185: Dec 6th 2023 at 8:50:00 AM

I third the desire for the old posting font back.

The Protomen enhanced my life.
skewview Since: Jun, 2013
#186: Dec 6th 2023 at 8:50:22 AM

It is possible that the monospace font used for the source/editor does not cover all the UTF-8 characters, but that is a minor inconvenience until one can be set up.

AFK with issues, will return
Amonimus the "Retromancer" from <<|Wiki Talk|>> (Sergeant) Relationship Status: In another castle
#187: Dec 6th 2023 at 8:51:27 AM

As a reminder, all foreign-language text on pages is currently saved as bytecode. The idea of the update is that foreign text you save in articles should now be retained the next time you edit (on the live page both look exactly the same). So feel free to replace the bytecode in articles with the foreign text itself.

And also forum posts should stop looking glitchy in previews.

Also yeah, literally any font but the standard one would be cool, maybe monospace can be applied to English only for now?

Edited by Amonimus on Dec 6th 2023 at 7:52:11 PM

TroperWall / WikiMagic Cleanup
Redmess Redmess from Netherlands Since: Feb, 2014
#188: Dec 6th 2023 at 8:53:30 AM

What's different, for example, is that you can now directly quote something in a different char set, say in Japanese. It's very convenient for a wiki where we regularly talk about manga and anime, for one thing.

Hope shines brightest in the darkest times
dArtagnanMusic Since: Oct, 2022
#189: Dec 6th 2023 at 8:53:49 AM

something about the new font on edit/history pages bothers me, but i can't put my finger on what
oh well, i'll get used to it

fillerdude Since: Jul, 2010
#190: Dec 6th 2023 at 8:54:20 AM

Happy to be back! Thanks for the upgrades!

Konnor Naïve Fool from Elsewhere Since: Dec, 2019 Relationship Status: A cockroach, nothing can kill it.
#191: Dec 6th 2023 at 8:56:02 AM

Cool so I guess I just won’t edit anything until it goes back to looking how it used to because it’s kinda wigging me out

Don't ask about my profile picture. I don't know either.
ScrewySqrl Since: Jan, 2001 Relationship Status: YOU'RE TEARING ME APART LISA
#192: Dec 6th 2023 at 8:56:13 AM

I tried to edit a page but had no cursor. I have it posting here though.

Edited by ScrewySqrl on Dec 6th 2023 at 11:56:27 AM

kory MOD Admin (Life not ruined yet) Relationship Status: watch?v=dQw4w9WgXcQ
#193: Dec 6th 2023 at 8:58:58 AM

I’ll work on adding the font option change later today [tup]

Now Monitoring Query Bugs and Query Wishlist
Floater From dawn to dusk from Song of the Red Cardinal (Plucky Ensign) Relationship Status: Abstaining
#195: Dec 6th 2023 at 9:07:19 AM

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Heck yeah! cool

A hopeless mess in a kernel of truth.
skewview Since: Jun, 2013
#196: Dec 6th 2023 at 9:10:46 AM

Kory, please see here for some help with fonts with broad unicode coverage.

AFK with issues, will return
Spinosegnosaurus77 Ramen Fairy from Ontario, Canada Since: May, 2011 Relationship Status: All I Want for Christmas is a Girlfriend
#197: Dec 6th 2023 at 9:12:24 AM

Double apostrophes (for italics markup) & quotation marks are hard to tell apart with the new font.

Peace is the only battle worth waging.
MewLettuceRush "The wilderness chose" from E (4 Score & 7 Years Ago)
#198: Dec 6th 2023 at 9:13:16 AM

Is UTF 8 a programming language or a database? I know nothing about coding, so I was confused.

Cheeky but charming.
lalalei2001 Since: Oct, 2009
#199: Dec 6th 2023 at 9:15:17 AM

Thanks, Kory! :)

The Protomen enhanced my life.
Ian07 Since: Jul, 2022
#200: Dec 6th 2023 at 9:16:12 AM

[up][up] It's a format for encoding text.

Edited by Ian07 on Dec 6th 2023 at 12:16:19 PM


Total posts: 331