Ramblings on technology with a dash of social commentary
  • Special characters show up as a question mark inside of a black diamond

    Posted on June 6th, 2009 by phpguru

    Almost every web developer has run into the problem of character sets and character encoding. Joel On Software has the most succinct post on the topic of Unicode.

    Here’s the problem. Your web page has certain characters that cannot be displayed properly. Instead of typographer’s quotes (“curly quotes” instead of foot ' and inch " marks), ‘e acute’ (as in the word résumé), the copyright symbol (©), registered symbol (®), etc., usually copied from a program like Microsoft Word, your webpage renders with the dreaded “black diamond question mark” symbol: �

    Since the earliest days of the web, we’ve been using HTML Entities to create these characters. HTML Entities are escape sequences to represent special characters in your web page markup. For example, the syntax

    &copy;

    renders as © in a webpage. I realize I can simply use these escape codes to get special characters to display correctly, but why should I have to? What if I have hundreds of pages of content with curly quotes in them and I just want to be able to render a page without using HTML entities?
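
To make the entity mechanism concrete, here is a minimal sketch in Python (used purely for illustration; the stack in this post is PHP). It shows a named entity and its numeric equivalent decoding to the same character, and a hypothetical helper, `to_entities`, that escapes every non-ASCII character so the markup no longer depends on the declared charset.

```python
import html


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    Hypothetical helper for illustration; it is not part of the original post.
    """
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# A named entity and its numeric equivalent decode to the same character.
print(html.unescape("&copy;"))   # ©
print(html.unescape("&#169;"))   # ©

# Going the other way: escape a string so it is pure ASCII.
print(to_entities("résumé ©"))   # r&#233;sum&#233; &#169;
```

This is exactly the chore the post would like to avoid: it works, but it means running every piece of content through an escaping step.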

    When I develop websites, I run WAMPServer, which uses PHP 5, MySQL 5, and Apache 2 on Windows XP. I’ve been confused by this topic off and on for over two years now. And I’m not the only one.

    I’ve tried troubleshooting the character encoding problem from the top down, starting with the web server software and working down the chain.

    I have edited my Apache httpd.conf file with

    AddDefaultCharset UTF-8

    I have edited my php.ini file with

    default_charset = "utf-8"

    [Screenshot: httpd.conf and php.ini set to UTF-8]

    … and restarted Apache.

    I have also made sure that MySQL is using UTF-8. This includes both the MySQL Database itself… 

    [Screenshot: MySQL database and connection set to UTF-8]

    …the MySQL connection, the MySQL table, and the MySQL field where my data is stored.

    [Screenshot: MySQL field using utf8_general_ci]

    As you can see here, I even have gone into Firefox and set it to accept UTF-8 and receive UTF-8. 

    [Screenshot: Firefox Content > Fonts > Advanced, default character encoding]

    Still, I get unrenderable characters. WHY!?

    [Screenshot: question mark in a diamond in Firefox with UTF-8]

    I’m using Firebug to display the HTML Headers, and I’ve verified this is not a bug in Firefox. I’m seeing the dastardly � character whether I use Firefox 2, Firefox 3, Opera, Safari, or Chrome.

    I’m sure there’s a character encoding guru out there somewhere who can tell me what I’m missing. I know, I know, I can just switch to ISO-8859-1 (Latin-1, which browsers treat as Windows-1252) anywhere along the chain of encoding, and everything will be fine. And indeed, this is true. It seems almost unfathomable that I’ve checked every possible setting related to the character set of the page I am trying to serve, and still get � everywhere.

    Still, I thought the whole idea behind the move to UTF-8 was to prevent me from having to worry about all this stuff. I’d love to just happily store pages, create pages and serve pages in UTF-8 so all my characters look like they’re supposed to and I don’t have to escape them at all. Isn’t that the point?

    I’m not convinced that I fixed the issue, but I have found a workaround. I decided to turn off the charset handling in both httpd.conf and php.ini, and added…

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

    …to my page template. It works, but I still want to know why it works, or more accurately, why declaring everything UTF-8 doesn’t.
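
One common cause of this exact symptom, offered here as an assumption because the post never confirms it, is that the bytes of the content were saved in Windows-1252 (for example, pasted from Word into a non-UTF-8 editor or table) while everything around them was declared UTF-8. A minimal Python sketch of that mismatch follows; `render_as` is a hypothetical helper that decodes bytes roughly the way a browser would for a declared charset.

```python
text = "résumé ©"

# Assumption: the content was stored as Windows-1252 bytes, not UTF-8.
legacy_bytes = text.encode("cp1252")


def render_as(data: bytes, declared_charset: str) -> str:
    """Decode bytes using the declared charset, substituting U+FFFD
    (the black diamond question mark) for any invalid byte sequence."""
    return data.decode(declared_charset, errors="replace")


# Declared UTF-8, but the bytes are not valid UTF-8: diamonds appear.
print(render_as(legacy_bytes, "utf-8"))    # r�sum� �

# Declared as the encoding the bytes are really in: everything looks fine.
print(render_as(legacy_bytes, "cp1252"))   # résumé ©

# Genuine UTF-8 bytes declared as UTF-8 also render correctly.
print(render_as(text.encode("utf-8"), "utf-8"))  # résumé ©
```

If this is what is happening, declaring ISO-8859-1 works because it finally matches the bytes on disk, and the lasting fix would be to re-save or convert the content itself as UTF-8 rather than to change the declarations.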

    Update Oct. 2011

    I’ve just discovered yet another issue that is not easy to figure out. It turns out that you can get the dreaded question-mark-in-diamond characters even in UTF-8 encoded files, if the file is written with a BOM (byte order mark). We had a PHP application including several files, one of which was encoded with a BOM. Special characters such as ç and õ were showing up fine on one part of the page, and as � on other parts of the page. We removed the BOM from that include file with Notepad++ on Windows and everything was fine again.
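
For anyone who would rather check a batch of include files than open each one in an editor, here is a small Python sketch (an illustration only, not what we actually ran) that detects and strips a UTF-8 BOM. The helper names `has_utf8_bom`, `strip_utf8_bom` and `strip_bom_in_file` are hypothetical.

```python
import codecs
from pathlib import Path


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


def strip_bom_in_file(path: Path) -> bool:
    """Rewrite the file without its BOM; return True if a BOM was removed."""
    data = path.read_bytes()
    if not has_utf8_bom(data):
        return False
    path.write_bytes(strip_utf8_bom(data))
    return True


# In-memory demonstration with a BOM-prefixed UTF-8 string.
sample = codecs.BOM_UTF8 + "ação".encode("utf-8")
print(has_utf8_bom(sample))                    # True
print(strip_utf8_bom(sample).decode("utf-8"))  # ação
```

Calling `strip_bom_in_file` on each suspect file rewrites it in place, so it is worth running on a copy first.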

  • How Much? Who is doing the thinking?

    Posted on June 3rd, 2009 by phpguru

    With every potential website project comes Question Number One: How much will it cost?

    Especially in a turbulent economy, everyone is looking for the best deal on any upcoming expense, and any smart business person would agree that it generally pays to get multiple quotes and pick the best company at the lowest rate.

    Are you looking to invest in adding new features to your website right now? During a time in America when it’s pretty clear that taxes are going to go up (it’s only a question of when and how much), I would be willing to bet that most companies today have a list of important things to do in the next 12 months… and adding new features, sections, content and functionality to the website, or redesigning the website, or making a new website… is probably not near the top of that list.

    But whether you are considering taking on new internet development projects right now or not is a good topic for an upcoming post. This post is about how website development should be priced and why it is important to consider a question that is overlooked by many on both sides, client and designer/programmer.

    How much should a website cost?

    Once a business decides to take on a new online marketing project, upgrade the company website, or launch a new internet advertising campaign, one of the first questions that comes up is invariably related to ROI. What will our increase in revenue be, or our reduction in costs, if we invest $X in doing Y with technology?

    As a system architect and online application developer with 15 years of experience working as a designer, programmer, user interface designer, database developer and interactive multimedia artist, I am uniquely qualified to describe what I see as an irony in how companies approach investments in technology.

    Client A wants to shop rates, find a family friend or niece of a brother-in-law, or post the project on a site like getaforeignerfordirt.bs. You know, a “kid” who “does websites” for “cheap”. The old adage, “You get what you pay for,” comes to mind. In this case, The Client is Doing The Thinking.

    Business says, This is exactly what we want done. Here’s the Photoshop file with the screens and forms. Here’s the database schema. Here’s the API we need built. Here’s the FTP login for staging. We’ll manage our own DNS, MX, and upload all of the content in digital format. We’ve written all the copy, error messages and auto-responder emails. This is all pre-approved and ready to assemble.

    Okay then, you deserve a great rate on your development project, because you provided everything in a ready-to-go format that allows your Web Team to simply implement without having to think too much. (In 15 years of software design, I have yet to meet such a client or receive such a perfectly organized project.)

    Client B wants to find the company that provides the best overall value, brings a wealth of ideas and experience to the table, and acts as a trusted marketing partner. That means finding an online agency with the background and on-site skill set to take your ideas, expand upon them, provide you with a scope and a timeline, and execute all the design and development required for a successful launch. In this case, The Developer is Doing The Thinking.

    Business says, This is kinda what we’re looking for. We want this feature but not that feature. We’re not sure what it will take, we just know that how we’ve been doing it isn’t working.

    Okay then, you ought to be ready to pony up with a decent budget because I’m going to have to invest my time, experience and brainpower in making your 5-minute idea come out smelling like roses. I’ll need to design interfaces and databases, and write PHP code, JavaScript validation and input sanitization. I’ll have to work through multiple rounds of changes to match the idea you have in your head about what it is you want to achieve. I’ll have to figure out where to stage it, write most if not all of the content, find or create images and generally pretend for a while that I work for your company and treat your business like it’s my business. (The vast majority of projects I run into are of this flavor.)

    As these two contrived examples illustrate, the cost of a website project depends on who is doing the thinking. In both scenarios, the end result could have been mostly identical, but in one example, the web development company allocated just a few hours of a programmer’s time, and in the other example, it took the resources of the entire company, including project management, design, database development, programming and testing.

    A wise person once said to me, “It’s not about the agency, it’s about the agent.” This is especially true of technology projects. Who you pick for your project can have a profound effect on the results, the timeline, the budget and the overall outcome. Just remember that, generally speaking, you can’t have it both ways. If you want rock-bottom rates, you’d better be prepared to put in the work of providing your web guy with everything he needs to get the job done quickly and easily. If you want your Web Designer to invest significant amounts of time and energy in your project, why not reward them for it? They’re making an investment in your business.

    All too often, Client B has the same budget as Client A, and wonders why it’s so hard to find a good web programmer.