What is HTML?

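HTML is a markup language used to describe web pages, written in the same tag-and-attribute style as XML. (Strictly speaking, HTML is not derived from XML: both descend from the older SGML standard, and XHTML is the XML-based formulation of HTML.) HTML stands for “Hyper Text Markup Language”. Before we start learning HTML in detail, let us first understand: Markup Language, XML, etc…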

What is a Markup Language?

A markup language is a system for describing data (information) in a way that keeps the data distinguishable from the language syntax. For example, suppose we would like to store student records in a file named student_database.txt. How can we write it? And how can a computer program read the generated file later, as and when required?
Let us decide the rules for the file format.

  • We would like to store student id, name, contact number, email, and gender in the file.
  • Each column is separated by a space.
  • Each record is separated by a newline character.

Keeping the above constraints in mind, a sample student file will look like:

1 Ankit 9925720005 ankit@ankit.co male
2 Sonali 9925720005 sonali@ankit.co female
3 Mansukh 9925720005 mansukh@ankit.co male

And here we go. Our file format is ready! But a problem arises when the data itself contains a space character. E.g. suppose the name of the student is not “Ankit” but “Ankit Virparia”. Then the 1st line of the file becomes:

1 Ankit Virparia 9925720005 ankit@ankit.co male
2 Sonali 9925720005 sonali@ankit.co female
3 Mansukh 9925720005 mansukh@ankit.co male

But our simple computer program cannot detect this! It will treat “Virparia” as the phone number and “9925720005” as the email, which is absolutely wrong! What’s the solution? Let’s try one more approach and use “$” as the separator instead of a space.

1$Ankit Virparia$9925720005$ankit@ankit.co$male
2$Sonali$9925720005$sonali@ankit.co$female
3$Mansukh$9925720005$mansukh@ankit.co$male

With $ as the separator, our program can read the information correctly. But again, what if we want to store $ as data? Problem… Problem… Problem…
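
The separator problem can be reproduced with a few lines of Python. This is only a sketch: the function parse_line and the FIELDS list are hypothetical names made up for this illustration, and the third record (with a “$” inside the name) is invented test data.

```python
# Hypothetical reader for the student file; the field names follow the list above.
FIELDS = ["id", "name", "contact", "email", "gender"]

def parse_line(line, separator):
    """Split one record on the separator and pair the pieces with FIELDS."""
    parts = line.split(separator)
    return dict(zip(FIELDS, parts)), len(parts)

# Space separator: a name containing a space shifts every later column.
record, count = parse_line("1 Ankit Virparia 9925720005 ankit@ankit.co male", " ")
# record["contact"] is "Virparia" and record["email"] is "9925720005"

# "$" separator: the same record is now read correctly.
fixed, fixed_count = parse_line("1$Ankit Virparia$9925720005$ankit@ankit.co$male", "$")

# But a "$" inside the data breaks the format again: six pieces instead of five.
broken, broken_count = parse_line("1$Ankit $ Virparia$9925720005$ankit@ankit.co$male", "$")
```

Whatever single character we pick as the separator, the same failure returns as soon as that character appears in the data.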

The ultimate solution is to standardize a language that is known to everybody! That is what we call a Markup Language. XML is one of the most popular markup languages.

What is XML?

XML (Extensible Markup Language) is a markup language that defines a set of rules for formatting information in a file so that it is interpretable by humans as well as machines.

An XML document mainly consists of 2 parts: 1) markup syntax and 2) the information/user data to be stored.

How do we write markup syntax in XML?

To write markup syntax, we use several special characters such as <, >, space, “, ‘, etc. We represent information like this: <tag>INFORMATION GOES HERE</tag>. Each <> block is known as a tag, and tags usually come in pairs. XML gives us the freedom to introduce a tag with any name. The developer introduces a tag name and writes the software that understands the meaning of that tag. Another important aspect of a tag-based language is that tags can be nested. For example, here is the student data sheet represented the XML way:

<allstudents>
  <student>
    <id>1</id>
    <name>Ankit Virparia</name>
    <contact>9925720005</contact>
    <email>ankit@ankit.co</email>
    <gender>male</gender>
  </student>
  <student>
    <id>2</id>
    <name>Sonali</name>
    <contact>9925720005</contact>
    <email>sonali@ankit.co</email>
    <gender>female</gender>
  </student>
  <student>
    <id>3</id>
    <name>Mansukh</name>
    <contact>9925720005</contact>
    <email>mansukh@ankit.co</email>
    <gender>male</gender>
  </student>
</allstudents>
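
A reader program for such a file could be sketched as follows. It uses Python's standard xml.etree.ElementTree module, which is just one possible choice of parser; the variable names are made up for this illustration, and the data is shortened to two students.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the student sheet, with every tag properly closed.
xml_text = """<allstudents>
  <student>
    <id>1</id>
    <name>Ankit Virparia</name>
    <contact>9925720005</contact>
    <email>ankit@ankit.co</email>
    <gender>male</gender>
  </student>
  <student>
    <id>2</id>
    <name>Sonali</name>
    <contact>9925720005</contact>
    <email>sonali@ankit.co</email>
    <gender>female</gender>
  </student>
</allstudents>"""

root = ET.fromstring(xml_text)

# Turn each <student> element into a dictionary of tag name -> text.
students = [
    {field.tag: field.text for field in student}
    for student in root.findall("student")
]

for student in students:
    print(student["id"], student["name"], student["email"])
```

Note that the space in “Ankit Virparia” causes no trouble here, because the start and end tags mark where each value begins and ends.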

So, here in our example I have introduced my own tags: <allstudents>, <student>, <id>, <name>, etc. And I will write my program/parser/reader in such a way that it understands these newly introduced tags. You must have a question in mind: what if we need to store a tag as data/information?

The answer is very simple. XML has a special syntax that stops the parser from processing nested tags. The syntax is: <![CDATA[ and ]]>. We also call it a CDATA section.

For example:
<tag>
<![CDATA[
<not_in_use>test</not_in_use>
]]>
</tag>

Note: A CDATA section cannot contain the string "]]>". Nesting of CDATA sections is not allowed. The "]]>" that marks the end of the CDATA section cannot contain spaces or line breaks.
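
The effect of a CDATA section can be checked with a small sketch, again assuming Python's xml.etree.ElementTree as the parser. The tag inside the CDATA section comes back as plain text, not as a child element.

```python
import xml.etree.ElementTree as ET

# Same idea as the example above, written on one line to keep the text exact.
root = ET.fromstring("<tag><![CDATA[<not_in_use>test</not_in_use>]]></tag>")

children = list(root)   # nested elements the parser found inside <tag>
content = root.text     # character data the parser found inside <tag>

print(len(children))    # no child elements were created
print(content)          # the markup inside CDATA is kept as ordinary text
```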

What is a tag?

A tag is a markup construct which starts with “<” and ends with “>”. There are 3 types of tags:

  • start-tags; for example: <section>
  • end-tags; for example: </section>
  • empty tags, where no data or nested tag is possible; for example: <line-break />
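
To see how a parser treats the three kinds of tags, here is a small sketch that lists the events reported while reading a document. It assumes Python's xml.etree.ElementTree.iterparse; the sample document is made up for this illustration.

```python
import io
import xml.etree.ElementTree as ET

# One start-tag, one empty tag, and one end-tag.
source = io.BytesIO(b"<section>Hello<line-break /></section>")

# Record every "start" and "end" event together with the tag name.
events = [
    (event, element.tag)
    for event, element in ET.iterparse(source, events=("start", "end"))
]

for event, tag in events:
    print(event, tag)
```

The empty tag produces a start event immediately followed by an end event, so for the parser <line-break /> behaves like <line-break></line-break>.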

What is an attribute?

An attribute is also markup syntax, by which arguments can be passed to a tag. It is written as a name-value pair inside a start tag or an empty tag.
E.g.
<tag_name name="value" name2="value2">data</tag_name>
<tag_name name="value" name2="value2" />
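
A parser hands the attributes to the program as name-value pairs. The sketch below assumes Python's xml.etree.ElementTree and reuses the two example tags, written with straight quotes.

```python
import xml.etree.ElementTree as ET

# Attributes on a start tag that also carries data.
element = ET.fromstring('<tag_name name="value" name2="value2">data</tag_name>')

# Attributes on an empty tag.
empty = ET.fromstring('<tag_name name="value" name2="value2" />')

print(element.attrib)        # all attributes as a dictionary
print(element.text)          # the data between the start and end tags
print(empty.get("name2"))    # one attribute looked up by name
print(empty.text)            # an empty tag has no data
```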

Use of XML:

So we can conclude that XML is useful to store user data by introducing custom tags.

Introduction to HTML (Again)

HTML is a tag-based markup language, similar in style to XML, used to describe web pages. HTML stands for “Hyper Text Markup Language”.
As per our discussion, we can conclude that XML is used to store information using custom tags, and the parser/reader for it will also be a custom one. HTML, on the other hand, has a fixed, predefined set of tags, and parser/reader programs for it already exist: they are known as BROWSERS (Mozilla Firefox, Google Chrome, Apple Safari, Opera, Internet Explorer, etc.). So we as developers will learn HTML, which the browser already knows how to parse, and use it to generate a custom web page.
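
The idea of a ready-made HTML reader can be tried without a browser. The sketch below uses Python's standard html.parser module as a stand-in; it is not a browser and does not draw anything, and the class name TagCollector and the sample page are made up for this illustration.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Collects the tag names and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Called once for every start tag the parser meets.
        self.tags.append(tag)

    def handle_data(self, data):
        # Called for the text between tags; skip whitespace-only pieces.
        if data.strip():
            self.text.append(data.strip())

page = "<html><body><h1>Hello</h1><p>My first page</p></body></html>"

collector = TagCollector()
collector.feed(page)
collector.close()

print(collector.tags)
print(collector.text)
```

A browser does the same kind of reading, and then additionally knows what each predefined tag means, for example that <h1> is a heading and <p> is a paragraph.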

HTML file extensions

.HTML
.HTM

Older operating systems such as MS-DOS supported only 3 characters in a file extension, so “.htm” was used. Nowadays “.html” is more popular and is supported by all modern operating systems.

HTML editors

Notepad, WordPad, Notepad++, Adobe Dreamweaver, Quanta Plus, or any other text editor can be used to write HTML.

HTML characteristics

  • HTML is not a case sensitive language: tag and attribute names such as <P> and <p> mean the same thing.
  • HTML is used to generate a view. It is not used to store/transport information.
  • It provides a flexible way to design web pages along with the text.
  • Links can be added to web pages, which helps readers browse the information of their interest.
  • HTML documents can be displayed on any platform, such as Macintosh, Windows, and Linux.
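
The first characteristic, case insensitivity, can be observed with a short sketch. It assumes Python's html.parser, which reports tag names in lowercase no matter how they were typed; the class name TagNames is made up for this illustration.

```python
from html.parser import HTMLParser

class TagNames(HTMLParser):
    """Records the name of every start tag as the parser reports it."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)

parser = TagNames()
parser.feed("<P>one</P><p>two</p><Br>")
parser.close()

names = parser.names
print(names)  # <P>, <p> and <Br> are all reported in lowercase
```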

HTML Versions and Standards

XHTML 1.0 | HTML 4.01 | XHTML Basic | Modularization of XHTML | XHTML 1.1 | XML Events

HTML and Browser Support

Test your browser at: http://html5test.com/