HTML5 Showcase

I’ve started a new project I call the HTML Showcase.

View it at http://rawgit.com/xpika/html-showcase/master/index.html

The source is on GitHub at http://github.com/xpika/html-showcase

The idea is to have a single page featuring as many HTML elements and styles as possible. This will be useful for beginners, and also for pros who want speed-dial documentation for that rare HTML tag or style. It should also work as a kind of caniuse test, with a real result showing what an HTML widget or style looks like in the browser you are viewing it in.

Happy to accept pull requests.

Making the Roman alphabet out of Chinese characters

Some Unicode characters look similar to letters of the Latin alphabet without being in the Latin range to begin with. I was inspired by the text “乇乂ㄒ尺卂 ㄒ卄丨匚匚” (extra thicc), which I saw on Reddit. I’m not sure of its exact origin. I wanted to find a Chinese character that looks like each letter of the alphabet. Unfortunately, some weren’t so easy to find. For instance, I couldn’t find anything that looks much like the letter “v”. After finding lookalikes for the whole alphabet, I created a tool at

http://chinesed-roman.xy30.com

I’m not sure about the name. Maybe something like kanjlatin or chicode would be better.

The full listing of the alphabet reads:

卂日匚勺乇于彑廾工长辶爪力口尸中尺丂七凵立山㐅丫之.

Some examples:

廾乇辶辶口 山口尺辶勺

廾口山 卂尺乇 丫口凵?

七山口 尸辶凵丂 十山口 工丂 于口凵尺
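The listing above can be turned into a simple lookup table. The following is a minimal Haskell sketch, not the code behind the tool: the names lookalikes and toLookalike are made up for illustration. Note that the listing contains 25 characters, so one letter has no entry. The examples confirm 工 = i and 辶 = l, which leaves 长 as either j or k; this sketch assumes it is k and passes j through unchanged.

```haskell
import Data.Char (toLower)
import Data.Maybe (fromMaybe)
import System.IO (hSetEncoding, stdout, utf8)

-- Lookup table built from the 25-character listing in the post.
-- ASSUMPTION: the letter without a lookalike is 'j' (so 长 stands for 'k').
lookalikes :: [(Char, Char)]
lookalikes =
  zip "abcdefghiklmnopqrstuvwxyz"
      "卂日匚勺乇于彑廾工长辶爪力口尸中尺丂七凵立山㐅丫之"

-- Replace every letter that has a lookalike (case-insensitively);
-- anything else (spaces, punctuation, 'j') is left as it is.
toLookalike :: String -> String
toLookalike = map convert
  where
    convert c = fromMaybe c (lookup (toLower c) lookalikes)

main :: IO ()
main = do
  hSetEncoding stdout utf8 -- avoid encoding errors in non-UTF-8 locales
  putStrLn (toLookalike "hello world")
  putStrLn (toLookalike "how are you?")
```

Running it should reproduce the first two examples above. An association list with lookup is slow for large inputs, but for 25 entries it keeps the sketch dependency-free.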

I think it works fairly well. It will probably be a little cryptic to read, but isn’t that half the fun of it?

Why the headphone jack should go at the top.

When pocket radios came out, they always had the headphone jack somewhere near the top. At some stage, some designers wanted to be different and put it at the bottom.

One can make various arguments for the top or for the bottom.

Pro Top:

  • When holding the phone upright, the jack is closer to things above it, such as your head, which gives the cable better reach for the same length.
  • When holding the phone upright, you can more easily see the jack.

Pro Bottom:

  • You can put the phone in your lower pocket more easily.
  • When using a phone stand the cable can be more easily tucked away.

There is another, more complex reason why I like the audio jack to be at the top. Phones can be very slippery. With the buds in your ears and the phone in your pocket, the cable can easily catch on something. Your phone can then fly into the air and easily break upon landing.

Making the phone easy to slide into your pocket also makes it easy for it to slip out. With the jack at the top, it’s harder to stuff the phone into your pocket, but it’s also less likely to fall out.

When the jack is at the bottom of your pocket, the phone literally has to rotate all the way around before it has any energy to fly out. By that stage, it’s much more likely that your earbuds will pull out instead and end up hanging somewhere above your feet.

 

Why you should put doctype html at the top of your document

If you create the following HTML document:

<html>
   <head>
      <style>
         body {
         font-size:40px;
         }
      </style>
   </head>
   <body>
      hello
      <table>
         <tr>
            <td>
               hello
            </td>
         </tr>
      </table>
   </body>
</html>

It will render like this:

[Screenshot: screen-shot-2016-12-15-at-7-51-35-pm]

If you add <!doctype html> to the top of your document like this:

<!doctype html>
<html>
   <head>
      <style>
         body {
         font-size:40px;
         }
      </style>
   </head>
   <body>
      hello
      <table>
         <tr>
            <td>
               hello
            </td>
         </tr>
      </table>
   </body>
</html>

It will render like this:

[Screenshot: screen-shot-2016-12-15-at-7-56-31-pm]

What happens is that browsers such as Firefox assume a page without a doctype is an old website, so they try to render it the way websites used to be rendered. This old way of rendering is called “quirks mode”. One of its quirks is that tables do not inherit the font size from body, so in the first document the hello inside the table is not rendered at 40px. Adding <!doctype html> signals that you are writing a modern web page, so the browser does not use quirks mode.
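If you generate or check HTML files from a program, a quick test for the doctype can catch this early. Here is a rough Haskell sketch; the function name startsWithHtml5Doctype is made up for illustration. It only looks for <!doctype html> (in any letter case) after leading whitespace, so it is deliberately stricter than a browser: it does not handle comments before the doctype, extra spaces inside it, or older doctypes that also avoid quirks mode.

```haskell
import Data.Char (isSpace, toLower)
import Data.List (isPrefixOf)

-- True when the document begins (after optional whitespace) with the
-- HTML5 doctype. The comparison is case-insensitive, so <!DOCTYPE html>
-- and <!doctype html> are both accepted.
startsWithHtml5Doctype :: String -> Bool
startsWithHtml5Doctype doc =
  "<!doctype html>" `isPrefixOf` map toLower (dropWhile isSpace doc)

main :: IO ()
main = do
  print (startsWithHtml5Doctype "<html><body>hello</body></html>")
  print (startsWithHtml5Doctype "<!DOCTYPE html>\n<html><body>hello</body></html>")
```

The first document has no doctype and the second does, so the two lines print False and True.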

A taste of tag soup

TagSoup is a Haskell library for parsing non-compliant HTML.

To explain why you might want this, let’s start by considering the following table.

<table id="test" border="1">
<tr>
<td>test
</tr>
</table>

Notice the missing closing td tag.

A browser still renders this as a table, border and all.

[Screenshot: the rendered table]

Though I don’t have the actual stats, judging by how often I make mistakes in my own HTML, I think this is a valid enough reason to suspect that the HTML you try to parse may not be compliant.

The problem is, a strict XML parser would not parse this. What we need is a parser which is lenient enough to parse malformed HTML with some degree of usefulness. Remote controlling a browser session would be a novel solution, but it incurs a few overheads that make it slower, harder to build and harder to code. Let’s see how far we can get with some common XML parsers.

For example, the popular HaXml produces the following error when trying to parse this malformed document.

Prelude Text.XML.HaXml.Parse> xmlParse "" "<b>hello world</i></b>"
*** Exception: in element tag b,
tag <b> terminated by </i>
  at file   at line 1 col 15

This library is way too strict to parse bad HTML. Let’s try another. This time we will try the package known on Hackage simply as xml.

Prelude Text.XML.Light> parseXML "<b>hello world</i></b>"
[Elem
   (Element{elName =
              QName{qName = "b", qURI = Nothing, qPrefix = Nothing},
            elAttribs = [],
            elContent =
              [Text
                 (CData{cdVerbatim = CDataText, cdData = "hello world</i>",
                        cdLine = Just 1})],
            elLine = Just 1})]

Slightly better, but as you can see, the closing i tag is counted as text, just like hello world.

Now, this time, we’ll try using TagSoup.

What is TagSoup? TagSoup is a library for parsing and re-rendering HTML.

To install it:

cabal install tagsoup

Once that is done, you can do some scraping. But first you might want to read up on the library by doing an online search for “tag soup haskell”.

Also, if you do not know what “tag soup” is, you might want to read the page about it on Wikipedia.

With that out of the way, first import TagSoup:

import Text.HTML.TagSoup

Now we can try and parse the broken HTML again.

Prelude Text.HTML.TagSoup> parseTags "<b>hello world</i></b>"
[TagOpen "b" [],TagText "hello world",TagClose "i",TagClose "b"]

…Now we’re getting somewhere.

As a small demonstration of the library, let’s extract just the text from the HTML document.

Prelude Text.HTML.TagSoup> concatMap fromTagText $ filter isTagText $ parseTags "<b>hello world</i></b>"
"hello world"

So here you see a library which is capable of handling broken HTML documents in a much more forgiving way than the usual XML library.
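To come back to the table with the missing closing td tag, here is a rough sketch of pulling the cell text out of it with TagSoup. The helpers brokenTable, endsCell, trim and cellTexts are made up for this example; parseTags, sections, (~==), innerText, isTagCloseName and isTagOpenName come from Text.HTML.TagSoup. The sketch assumes a cell ends at a closing td, a closing tr, or the next opening td, and it does not try to handle nested tables.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Text.HTML.TagSoup
  (Tag, innerText, isTagCloseName, isTagOpenName, parseTags, sections, (~==))

-- The table from earlier in the post, with its missing </td>.
brokenTable :: String
brokenTable = unlines
  [ "<table id=\"test\" border=\"1\">"
  , "<tr>"
  , "<td>test"
  , "</tr>"
  , "</table>"
  ]

-- ASSUMPTION: a cell ends at </td>, at </tr>, or at the next <td>.
endsCell :: Tag String -> Bool
endsCell t =
  isTagCloseName "td" t || isTagCloseName "tr" t || isTagOpenName "td" t

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- For every <td>, collect the text up to the point where the cell ends.
cellTexts :: String -> [String]
cellTexts html =
  [ trim (innerText (takeWhile (not . endsCell) (drop 1 cell)))
  | cell <- sections (~== ("<td>" :: String)) (parseTags html)
  ]

main :: IO ()
main = print (cellTexts brokenTable)
```

Because TagSoup hands back a flat list of tags rather than a tree, the missing closing tag is not an error; the code simply decides for itself where a cell stops.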