What is an “unauthenticated user”?

Every so often we have a client worrying about unauthenticated users. For example, as part of the output of SHOW PROCESSLIST they will see:

+-----+----------------------+--------------------+------+---------+------+-------+------------------+
| Id  | User                 | Host               | db   | Command | Time | State | Info             |
+-----+----------------------+--------------------+------+---------+------+-------+------------------+
| 235 | unauthenticated user | 10.10.2.74:53216   | NULL | Connect | NULL | login | NULL             |
| 236 | unauthenticated user | 10.120.61.10:51721 | NULL | Connect | NULL | login | NULL             |
| 237 | user                 | localhost          | NULL | Query   | 0    | NULL  | show processlist |
+-----+----------------------+--------------------+------+---------+------+-------+------------------+

Who are these unauthenticated users, how do they get there, and why aren’t they authenticated?

The client-server handshake in MySQL is a 4-step process. Those familiar with mysql-proxy already know these steps, as there are four functions that a Lua script in mysql-proxy can override. The process is useful to know for figuring out exactly where a problem is when something breaks.

Step 1: The client sends a connect request to the server. As far as I can tell, the request carries no information. It does mean, however, that if you try to connect to a host and port where no mysqld server is available, you will get:

ERROR 2003 (HY000): Can't connect to MySQL server on '[host]' (111)

Step 2: The server assigns a connection and sends back a handshake, which includes the server’s mysqld version, the thread id, the server host and port, the client host and port, and a “scramble buffer” (for salting authentication, I believe).

It is during Step 2 that connections first show up in SHOW PROCESSLIST. They are connected, but not yet authenticated. If there are issues with authentication, connections will be stuck at this stage. Most often, stuck connections are due to DNS not resolving properly, which the skip-name-resolve option will help with.
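To see which client addresses are piling up at this stage, the SHOW PROCESSLIST rows can be tallied by host. The following is a minimal Python sketch, not a finished tool: the rows are hand-copied from the sample output above, and `stuck_logins` is a hypothetical helper name. In practice the rows would come from whatever client library you use to run SHOW PROCESSLIST.

```python
from collections import Counter

def stuck_logins(rows):
    """Count connections still waiting at the login stage, per client address.

    `rows` is a list of dicts keyed by SHOW PROCESSLIST column names
    (an assumed shape; adapt it to what your client library returns).
    """
    counts = Counter()
    for row in rows:
        if row["User"] == "unauthenticated user":
            # Host is "address:port"; keep only the address part.
            client_address = row["Host"].rsplit(":", 1)[0]
            counts[client_address] += 1
    return counts

# Rows copied from the sample SHOW PROCESSLIST output in this post.
rows = [
    {"Id": 235, "User": "unauthenticated user", "Host": "10.10.2.74:53216",
     "Command": "Connect", "State": "login"},
    {"Id": 236, "User": "unauthenticated user", "Host": "10.120.61.10:51721",
     "Command": "Connect", "State": "login"},
    {"Id": 237, "User": "user", "Host": "localhost",
     "Command": "Query", "State": None},
]

for address, count in stuck_logins(rows).items():
    print(address, count)
```

If one address dominates the tally, that client (or its DNS entry) is a good place to start looking.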

Step 3: Client sends authentication information, including the username, the password (salted and hashed) and default database to use. If the client sends an incorrect packet, or does not send authentication information within connect_timeout seconds, the server considers the connection aborted and increments its Aborted_connects status variable.

Step 4: Server sends back whether the authentication was successful or not. If the authentication was not successful, mysqld increments its Aborted_connects status variable and sends back an error message:

ERROR 1045 (28000): Access denied for user 'user'@'host' (using password: [YES/NO])
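To make the “salted and hashed” password in Step 3 concrete, here is a small Python sketch of how the scramble buffer from Step 2 can be combined with the password. It assumes the SHA1-based scheme used by MySQL 4.1 and later (known as mysql_native_password); the post itself does not say which scheme is in play, and the function names here are hypothetical, not part of any MySQL library.

```python
import hashlib
import os

def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def client_token(password: str, scramble: bytes) -> bytes:
    """What the client sends in Step 3 (assumed scheme, see above)."""
    stage1 = sha1(password.encode("utf-8"))
    stage2 = sha1(stage1)
    # The scramble salts the hash, so the token differs on every connection.
    return xor_bytes(stage1, sha1(scramble + stage2))

def server_accepts(stored_hash: bytes, scramble: bytes, token: bytes) -> bool:
    """What the server checks in Step 4; it stores only SHA1(SHA1(password))."""
    candidate_stage1 = xor_bytes(token, sha1(scramble + stored_hash))
    return sha1(candidate_stage1) == stored_hash

# Illustration only: a random 20-byte scramble and a made-up password.
scramble = os.urandom(20)
stored_hash = sha1(sha1(b"s3cret"))

print(server_accepts(stored_hash, scramble, client_token("s3cret", scramble)))
print(server_accepts(stored_hash, scramble, client_token("wrong", scramble)))
```

The point of the sketch is that neither the password nor a reusable hash crosses the wire: a token captured on one connection is useless with the next connection's scramble.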

Hope this helps!

Video: Who is the Dick on My Site Keynote

I have already blogged about this keynote at http://www.pythian.com/blogs/948/liveblogging-who-is-the-dick-on-my-site.

If you are interested in actually seeing the video, the 286 MB .wmv file can be downloaded at http://technocation.org/videos/original/mysqlconf2008/2008_04_17_panelDick.wmv or played through your browser by clicking the “play” link at http://tinyurl.com/55c5ps. This is not to be missed!

From the official conference description:

Much of the data in a database is about people. Identity 2.0 technologies will lower the friction for people to provide and easily move data about themselves online.

This fast paced keynote will offer a background on Identity 2.0, discuss current roadblocks and future opportunities, and explore the potential impacts these will have on databases.

———–
I have already blogged about this keynote at https://sheeri.org/liveblogging-who-is-the-dick-on-my-site/
Colin Charles also blogged about it
Do not miss this keynote! See it on youtube.

Liveblogging: Who is the Dick on My Site?

Identity 2.0: A world that’s simple, safe and secure.

Who is the Dick on My Site? by Dick Hardt (Sxip Identity Corporation)

Quotes:
“Really, data is about people. It’s really identity data.”

“Identity helps you predict behavior.”

“Identity is who you are.”

“Identity is also what you like.”

“Identity enables you to uniquely identify somebody.”

“There are things that other people say about you, too.”

“Modern identity is about photo IDs so you can prove your identity.”

“Identity is a complicated issue….Everyone has a different idea of what it is.”

Identity transactions are:

  • party identification (who)
  • authorization (permission)
  • profile exchange (info about that person)
  • NOT record matching

Identity transactions can be:

  • verbal
  • but it’s unverified
  • need trust

How do you verify? ID: the subject matches the credential, assuming the feature that only the one person can use that ID.

Photo ID is asymmetrical in trust, because the issuing organization (province of British Columbia) doesn’t know when the ID is being used, so there’s some privacy.

What is digital identity?

  • sometimes, site registration
  • definitely a hassle, could be simpler
  • unverified, fewer trust cues than verbal

Interesting point: searching del.icio.us shows you what other people think you are.

How do you prove to a website who you are? It’s not what you give to the site, but what the site knows about you! If you have a good eBay rating, can you take that over to Craigslist?

What we want in Identity 2.0 is a way to make identity user-centric, not site-centric, so a person can move their identity around.

How do we solve this? You have a trusted agent that can give information to relying parties; a relying party is any site that the user wants to share information with. The agent does not need to trust the relying party, and the sites don’t need to trust the agent. The relying party does need to trust the agent (“issuer”), but that’s it. This is how OpenID works.

Identity data isn’t just data, it’s data about a person.

Why does identity matter?

“The future has arrived, it is just not evenly distributed yet.” William Gibson

More and more apps are becoming distributed (e.g., Google). Biometrics are becoming prevalent. There’s a lot of device convergence: a phone can pay for things, etc.

There are “digital natives” and “digital immigrants” — natives grew up with the computer, with the internet. An immigrant has an accent — “digital camera” for an immigrant, “camera” for a native.

Identity 2.0 predictions:

  • minimal passwords: the agent makes it simpler
  • rich portable profiles: don’t need to keep re-writing the profile information over and over
  • portable credentials: digital driver’s licence, prove attributes digitally
  • agency/delegation: an assistant can book a flight for you, or one site can get
  • reputation services: like blogosphere, page rank, great contributor to wikis or open source. Similar to credit rating.
  • identity services: disposable e-mail, one-time tokens, such as one-time payments, one-time phone numbers; all this stuff can help reduce spam and protect privacy.

State of user-centric identity:

  • functionality: there is nothing functional out there for what we need
  • industry: many organizations that wouldn’t normally work together are doing so (Grade: A)
  • standards: needs more work (Grade: C)
  • interop: standards not quite there, but folks are making it work (Grade: B)
  • deployment: there’s a start, but more needed (Grade: C)
  • utilization: nominal (Grade: D, probably should be F)

  • vitamins: should take, but don’t
  • painkillers: don’t want to take, but do
  • viagra: want to take, probably shouldn’t

Identity 2.0 is still at the vitamin stage. There’s no pain.

CHAR() vs. VARCHAR()

So, a little gotcha:

The CHAR() and VARCHAR() types are different types, but MySQL silently converts CHAR() fields longer than three characters to VARCHAR() when creating a table with at least one variable-length field (VARCHAR, TEXT, or BLOB).

http://dev.mysql.com/doc/refman/5.0/en/silent-column-changes.html

If any column in a table has a variable length, the entire row becomes variable-length as a result. Therefore, if a table contains any variable-length columns (VARCHAR, TEXT, or BLOB), all CHAR columns longer than three characters are changed to VARCHAR columns. This does not affect how you use the columns in any way; in MySQL, VARCHAR is just a different way to store characters. MySQL performs this conversion because it saves space and makes table operations faster.

However, that’s not entirely accurate, because according to the manual page at http://dev.mysql.com/doc/refman/5.0/en/char.html:

As of MySQL 5.0.3, trailing spaces are retained when values are stored and retrieved, in conformance with standard SQL. Before MySQL 5.0.3, trailing spaces are removed from values when they are stored into a VARCHAR column; this means that the spaces also are absent from retrieved values.

If you have a field such as name, and require it to not be blank, you probably have some function testing it before it goes into the database. However, most languages are perfectly happy that “ ” (a single space) isn’t blank. When it gets put into the database, however, it becomes blank if your column is a VARCHAR and trailing spaces are removed (the pre-5.0.3 behavior quoted above). This means folks may be able to get around your requirement of a non-blank field, and actually store a blank field in the database (as opposed to storing a space or series of spaces).
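Here is a rough Python illustration of the gotcha. Everything in it is hypothetical: `store_in_old_varchar` only simulates the trailing-space removal described in the manual quote (it does not talk to MySQL), and the two check functions stand in for whatever validation your application does.

```python
def naive_is_present(value: str) -> bool:
    # The kind of check many applications do: "not the empty string".
    return value != ""

def strict_is_present(value: str) -> bool:
    # Strips whitespace first, so a run of spaces counts as blank.
    return value.strip() != ""

def store_in_old_varchar(value: str) -> str:
    # Simulation only: VARCHAR before MySQL 5.0.3 dropped trailing spaces.
    return value.rstrip(" ")

name = "   "  # a user submits nothing but spaces

print(naive_is_present(name))             # the naive check lets it through
print(repr(store_in_old_varchar(name)))   # what ends up stored
print(strict_is_present(name))            # the stricter check rejects it
```

Trimming the value before testing it (and before storing it) closes the gap regardless of which column type or server version sits behind the application.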
