Liveblogging: Senior Skills: Python for Sysadmins

Why Python?

– Low WTF per minute factor
– Passes the 6-month test (if you write python code, going back in 6 months, you pretty much know what you were trying to do)
– Small Shift/no-Shift ratio (ie, you use the “Shift” key a lot in Perl because you use $ % ( ) { } etc, so you can tell what something is by context, not by $ or %)
– It’s hard to make a mess
– Objects if you need them, ignore them if you don’t.


Basics
Here’s a sample interpreter session. The >>> is the python prompt, and the ... is the second/subsequent line prompt:

>>> x='hello, world!';
>>> x.upper()
'HELLO, WORLD!'
>>> def swapper(mystr):
... return mystr.swapcase()
  File "<stdin>", line 2
    return mystr.swapcase()
         ^
IndentationError: expected an indented block

You need to indent the second line (a single space is enough) because Python uses indentation to mark blocks:

>>> def swapper(mystr):
...  return mystr.swapcase()
...
>>> swapper(x)
'HELLO, WORLD!'
>>> x
'hello, world!'

Substrings
partition is how to get substrings based on a separator:

>>> def parts(mystr, sep=','):
...  return mystr.partition(sep)
...
>>> parts(x, ',')
('hello', ',', ' world!')

You can replace text, too, using replace.

>>> def personalize(greeting, name='Brian'):
...  """Replaces 'world' with a given name"""
...  return greeting.replace('world', name)
...
>>> personalize(x, 'Brian')
'hello, Brian!'

By the way, the stuff in the triple quotes is automatic documentation (a docstring). It is stored on the function’s __doc__ attribute; names wrapped in double underscores are also called “dunder” names. Printing __doc__ shows the stuff in the triple quotes:

>>> print personalize.__doc__
Replaces 'world' with a given name
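
A minimal sketch of the same idea, assuming a current Python 3 interpreter (the session above is Python 2.5, where print is a statement). The docstring is just a string stored on the function object, and the standard inspect module can read it too:

```python
import inspect

def personalize(greeting, name='Brian'):
    """Replaces 'world' with a given name"""
    return greeting.replace('world', name)

# The docstring is a plain string attribute on the function object.
doc = personalize.__doc__
print(doc)

# inspect.getdoc() returns the same text with indentation cleaned up.
cleaned = inspect.getdoc(personalize)
print(cleaned)
```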

Loop over a list of functions and do that function to some data:

>>> funclist=[swapper, personalize, parts]
>>> for func in funclist:
...  func(x)
...
'HELLO, WORLD!'
'hello, Brian!'
('hello', ',', ' world!')

Lists

>>> v=range(1,10)
>>> v
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> v[1]
2
>>> v[5]
6
>>> v[-1]
9
>>> v[-3]
7

List slicing with “:”
>>> v[:2]
[1, 2]
>>> v[4:]
[5, 6, 7, 8, 9]
>>> v[4:9]
[5, 6, 7, 8, 9]
Note that there’s no error returned even though there’s no index 9: the end of a slice is exclusive, so v[4:9] covers indexes 4 through 8. If you did v[9], you’d get an error:
>>> v[9]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Negative indexes count from the end of the list, and the end of a slice is exclusive, so v[1:-1] does not print the first and last values:

>>> v[1:-1]
[2, 3, 4, 5, 6, 7, 8]

The full slice syntax is [start:end:step]:

>>> v[::2]
[1, 3, 5, 7, 9]
>>> v[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1]
>>> v[1:-1:4]
[2, 6]
>>> v[::3]
[1, 4, 7]
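
The slicing rules above can be collected into one short script. This is a sketch that assumes Python 3, where range() is lazy and has to be wrapped in list(); the variable names are only for illustration:

```python
v = list(range(1, 10))   # [1, 2, ..., 9]

first_two = v[:2]        # indexes 0 and 1
middle = v[1:-1]         # everything except the first and last items
every_other = v[::2]     # step of 2
backwards = v[::-1]      # a negative step walks the list in reverse

# A slice that runs past the end is silently clamped to the list length...
clamped = v[4:100]

# ...but indexing a single out-of-range position raises IndexError.
try:
    v[9]
    out_of_range = False
except IndexError:
    out_of_range = True

print(first_two, middle, every_other, backwards, clamped, out_of_range)
```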

Make a list of numbers with range

>>> l=range(10)
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Make a list from another list

>>> [pow(num,2) for num in l]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

append appends to the end of a list

>>> l.append( [pow(num,2) for num in l])
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]]
>>> l.pop()
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

extend takes a sequence and puts each of its items at the end of the list.

>>> l.extend([pow(num,2) for num in l])
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

A list can be made of a transformation, an iteration and optional filter:
[ i*i for i in mylist if i % 2 == 0]
transformation is i*i
iteration is for i in mylist
optional filter is if i % 2 == 0

>>> L=range(1,6)
>>> L
[1, 2, 3, 4, 5]
>>> [ i*i for i in L if i % 2 == 0]
[4, 16]
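
To make the three parts concrete, here is the same comprehension next to an explicit loop that does the same work. A sketch assuming Python 3 (hence list(range(...))):

```python
L = list(range(1, 6))

# The comprehension: transformation, iteration, optional filter.
squares_of_evens = [i * i for i in L if i % 2 == 0]

# The same thing spelled out as a loop.
result = []
for i in L:                     # iteration
    if i % 2 == 0:              # filter
        result.append(i * i)    # transformation

print(squares_of_evens, result)
```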

Tuples
Tuples are immutable lists, and they use () instead of []
A tuple can have any number of elements. Parentheses on their own only group an expression, so a one-item tuple needs a trailing comma:
x=(1,)
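
A short sketch of tuple literals of different sizes and of what “immutable” means in practice. The values are borrowed from elsewhere in these notes; Python 3 is assumed:

```python
not_a_tuple = (1)       # parentheses only group; this is the int 1
one_item = (1,)         # the trailing comma makes it a tuple
pair = ('jonesy', '1178')
empty = ()

# Tuples are immutable: item assignment raises TypeError.
try:
    pair[0] = 'brian'
    mutated = True
except TypeError:
    mutated = False

print(type(not_a_tuple).__name__, type(one_item).__name__, len(empty), mutated)
```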

Dictionaries aka associative arrays/hashes:

>>> d = {'user':'jonesy', 'room':'1178'}
>>> d
{'user': 'jonesy', 'room': '1178'}
>>> d['user']
'jonesy'
>>> d.keys()
['user', 'room']
>>> d.values()
['jonesy', '1178']
>>> d.items()
[('user', 'jonesy'), ('room', '1178')]
>>> d.items()[0]
('user', 'jonesy')
>>> d.items()[0][1]
'jonesy'
>>> d.items()[0][1].swapcase()
'JONESY'

There is no order to dictionaries, so don’t rely on it.
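
If you try the session above on Python 3 (an assumption; the notes use Python 2.5), keys(), values() and items() return views rather than lists, so d.items()[0] no longer works directly. A sketch that sorts first, which also avoids depending on dictionary order:

```python
d = {'user': 'jonesy', 'room': '1178'}

# Views cannot be indexed; sorted() turns them into ordered lists.
keys = sorted(d.keys())
items = sorted(d.items())

# Looping over items() unpacks each (key, value) pair.
for key, value in items:
    print("%s = %s" % (key, value))

first_value = items[0][1]   # items[0] is ('room', '1178')
```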

Quotes and string formatting
– You can use single and double quotes inside each other
– Inside triple quotes, you can use single and double quotes
– Variables are not interpolated inside strings; use printf-style string formatting instead:

>>> word='World'
>>> punc='!'
>>> print "Hello, %s%s" % (word, punc)
Hello, World!
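
A few more printf-style variations, as a sketch for Python 3 (where print is a function). The named-placeholder and numeric examples are additions for illustration and were not part of the talk:

```python
word = 'World'
punc = '!'

# Positional placeholders take a tuple, as in the session above.
greeting = "Hello, %s%s" % (word, punc)

# Named placeholders take a dictionary instead.
named = "Hello, %(word)s%(punc)s" % {'word': word, 'punc': punc}

# Numeric conversions work as in C's printf.
padded = "%05d %.2f" % (42, 3.14159)

print(greeting, named, padded)
```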

Braces, semicolons, indents
– Use indents instead of braces
– End-of-line instead of semicolons

if x == y:
 print "x == y"
for k,v in mydict.iteritems():
 if v is None:
  continue
 print "v has a value: %s" % v

This seems like it might be problematic because of long blocks of code, but apparently code blocks don’t get that long. You can also use folds in vim [now I need to look up what folds in vim are].

You can’t assign a value in a conditional statement’s expression, because you can’t use an = sign there. This is on purpose: it avoids bugs resulting from typing if x=y instead of if x==y.

The construct has no place in production code anyway, since you give up catching any exceptions.
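
One way to see the = restriction without crashing a script is to hand the source text to the built-in compile() and catch the error. A small sketch for Python 3; the "<demo>" filename is just a label:

```python
def compiles(source):
    """Return True if the source text is valid Python, False on a syntax error."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except SyntaxError:
        return False

assignment_in_if = compiles("if x = 1:\n    pass\n")    # rejected by the parser
comparison_in_if = compiles("if x == 1:\n    pass\n")   # fine

print(assignment_in_if, comparison_in_if)
```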

Python modules for sysadmins:
– sys
– os
– urllib/urllib2
– time, datetime (and calendar)
– fileinput
– stat
– filecmp
– glob (to use wildcards)
– shutil
– gzip
– tarfile
– hashlib, md5, crypt
– logging
– curses
– smtplib and email
– cmd
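
As a taste of how a few of these fit together, here is a sketch that uses os, glob, shutil and hashlib on a scratch directory. The file names and contents are made up, tempfile is used only to keep the demo self-contained, and Python 3 is assumed:

```python
import glob
import hashlib
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()    # scratch directory for the demo
try:
    for name, data in [('a.log', b'alpha\n'), ('b.log', b'beta\n'), ('notes.txt', b'skip me\n')]:
        with open(os.path.join(workdir, name), 'wb') as handle:
            handle.write(data)

    # glob expands shell-style wildcards into a list of paths.
    logs = sorted(glob.glob(os.path.join(workdir, '*.log')))

    checksums = {}
    for path in logs:
        shutil.copy(path, path + '.bak')    # keep a backup copy
        with open(path, 'rb') as handle:
            checksums[os.path.basename(path)] = hashlib.sha256(handle.read()).hexdigest()

    backups = sorted(os.path.basename(p) for p in glob.glob(os.path.join(workdir, '*.bak')))
    print(checksums, backups)
finally:
    shutil.rmtree(workdir)      # clean up the scratch directory
```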

The Zen of Python
To get this, type ‘python’ in a unix environment, then type ‘import this’ at the >>> prompt. I did this on my Windows laptop running Cygwin:

cabral@pythianbos2 ~
$ python
Python 2.5.2 (r252:60911, Dec  2 2008, 09:26:14)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

This was liveblogged, please let me know any issues, as they may be typos….

Liveblogging: Senior Skills: Grok awk

[author’s note: personally, I use awk a bunch in MySQL DBA work, for tasks like scrubbing data from a production export for use in qa/dev, but usually have to resort to Perl for really complex stuff, but now I know how to do .]

Basics:
By default, fields are separated by any number of spaces. The -F option to awk changes the separator on commandline.
Print the first field, fields are separated by a colon.
awk -F: '{print $1}' /etc/passwd

Print the first and fifth field:
awk -F: '{print $1,$5}' /etc/passwd

Can pattern match and use files, so you can replace:
grep foo /etc/passwd | awk -F: '{print $1,$5}'
with:
awk -F: '/foo/ {print $1,$5}' /etc/passwd

NF = built in variable (no $) that holds the number of fields in the current line
This will print the first and last fields of lines where the first field matches “foo”
awk -F: '$1 ~/foo/ {print $1,$NF}' /etc/passwd

NF = number of fields, ie, “7”
$NF = value of last field, ie “/bin/bash”
(similarly, NR is record number)
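
For readers coming from the Python session, here is a rough Python sketch of what awk -F: '/foo/ {print $1,$5}' does under the hood, plus $NF. The passwd-style records are made up for illustration, and Python 3 is assumed:

```python
import re

# Made-up passwd-style records, for illustration only.
sample = """root:x:0:0:root:/root:/bin/bash
foo:x:1001:1001:Foo User:/home/foo:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin"""

matches = []
for record in sample.splitlines():      # awk's implicit main loop over records
    fields = record.split(':')          # -F: sets the field separator
    if re.search(r'foo', record):       # the /foo/ pattern, tested against the whole line
        nf = len(fields)                # NF: number of fields
        matches.append((fields[0], fields[4], fields[nf - 1]))   # $1, $5, $NF

for first, fifth, last in matches:
    print(first, fifth, last)
```

Note that awk fields are numbered from 1 while Python list indexes start at 0, so $1 is fields[0].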


Awk makes assumptions about input, variables, and processing that you’d otherwise have to code yourself.

– “main loop” of input processing is done for you
– awk initializes variables for you, to 0
– input is viewed by awk as ‘records’ which are splittable into ‘fields’

This all makes a lot of operations very concise in awk, many things can be done w/ a one-liner that would otherwise require several lines of code.

awk key points:
– splits text into fields
– default delimiter is “any number of spaces”
– reference fields
– $0 is entire line
– create filters using ‘addresses’ which can be regexps (similar to sed)
– Turing-complete language
– has if, while, for, do-while, etc
– built-in math like exp, log, rand, sin, cos
– built-in string sub, split, index, toupper/lower

Patterns and actions
Pattern is first, then action(s)
Actions are enclosed in {}

only a pattern, no action:
'length>42'
but the default action is to print the whole line, so this will actually do something: print lines longer than 42 characters. (length with no argument is the length of the whole line, $0.)

only action, no pattern:
{print $2,$1}
do this to all lines of input

NR % 3 == 0
print every third line (the pattern is “NR mod 3 equals 0”)

{print $1, $NF, $(NF-1)}
print the first field, last field, and 2nd to last field

built-in variables
NF, NR we’ve done
FS = field separator (can be regexp)
OFMT = output format for numbers (default %.6g)

Patterns
– used to filter lines processed by awk
– can be regexp
/^root/ is the pattern in the following
awk -F: '/^root/ {print $1,$NF}' /etc/passwd

– Patterns can use fields and relational operators
To print 1st, 4th and last field if value of 4th field >10:
awk -F: '$4 > 10 {print $1, $4, $NF}' /etc/passwd

awk -F: '$0 !~ /^#/ && $4 > 10 {print $1, $4, $NF}' /etc/passwd

Range patterns
sed-like addressing : you can have start and end addresses
awk 'NR==1,NR==3'
prints only first three lines of the file
You can use regular expressions in range patterns:
awk -F: '/^root/,/^daemon/ {print $1,$NF}' /etc/passwd
start printing at the line that starts with “root”, the last line that is processed is the line starting with “daemon”
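
A range pattern is essentially a flag that switches on at the start match and off after the end match. A Python sketch of that logic, using hypothetical first-field values rather than a real /etc/passwd:

```python
lines = ["bin", "root", "sys", "daemon", "mail"]   # hypothetical first fields

in_range = False
selected = []
for line in lines:
    if not in_range and line.startswith("root"):    # start pattern /^root/
        in_range = True
    if in_range:
        selected.append(line)                       # the action runs inside the range
        if line.startswith("daemon"):               # end pattern /^daemon/
            in_range = False                        # the end line itself is still processed

print(selected)
```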

Range pattern “gotcha” – can’t mix a range with other patterns:
To do “start at a non-commented line where the value of $4 is less than or equal to 10, end at the first line where the value of $4 is greater than 10”

This does not work!
awk -F: '$0 !~ /^#/ $4 <= 10, $4 > 10' /etc/passwd

This is how to do it, {next} is an action that skips:
awk -F: '$0 ~ /^#/ {next} $4 <= 10, $4 > 10 {print $1, $4}' /etc/passwd

Basic Aggregation
awk -F: '$3 > 100 {x+=1; print x}' /etc/passwd
This gives a line of output as each matching line is processed. This gives a running total of x.

awk -F: '$3 > 100 {x+=1} END {print x}' /etc/passwd
This processes the “{print x}” action only after the entire file has been processed. This gives only the final value of x.
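
The difference between the two one-liners is easier to see spelled out. A Python sketch with made-up records, where running mirrors the first command and the final x mirrors the END version:

```python
# Made-up passwd-style records, for illustration only.
sample = """root:x:0:0:root:/root:/bin/bash
alice:x:1001:1001:Alice:/home/alice:/bin/bash
bob:x:1002:1002:Bob:/home/bob:/bin/bash"""

x = 0             # awk starts unset variables at 0 for you; Python does not
running = []
for record in sample.splitlines():
    fields = record.split(':')
    if int(fields[2]) > 100:      # $3 > 100
        x += 1
        running.append(x)         # first one-liner: print x on every match

print(running)    # one value per matching line
print(x)          # END {print x}: the final value only
```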

Arrays:
Support for regular arrays
Technically multi-dimensional arrays are not supported, but array indexes are just strings, so every array is an associative array and you can build your own compound keys.

Example:
awk -F, '{x[$1] = $2*($4 - $3)} END {for(key in x) {print key, x[key]}}' stocks.txt

The part before the END creates the associative array, the part after the END prints the array.

Extreme data munging:
awk -F, '{x[$1]=($2*($4 - $3))} END {for(z in x) {print z, x[z]}}' stocks.txt

ABC,100,12.14,19.12
FOO,100,24.01,17.45

output
BAR 271.5
ABC 698

For the line “ABC,100,12.14,19.12”
the calculation becomes

x[ABC] = 100 * (19.12 - 12.14) = 698

Aggregate across multiple variables:
awk -F, '{x[$1] = $2*($4 - $3); y+=x[$1]} END {for(z in x) {print z, x[z]}; print "Net:"y}' stocks.txt

Note that y is a running *sum* (not a running count like before).

Now, the above is hard to read; written out as a script file it is much easier:

#!/usr/bin/awk -f

BEGIN { FS="," }
{ x[$1] = $2*($4 - $3)
  y+=x[$1]
 }
END {
 for(z in x) {
  print z, x[z]
  }  # end for loop
 print "Net:"y
 } # end END block
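
For comparison, a Python sketch of the same aggregation, run over the two sample lines shown earlier. The meaning of the columns is not given in the notes, so the variable names col3 and col4 are placeholders:

```python
rows = """ABC,100,12.14,19.12
FOO,100,24.01,17.45"""

x = {}       # associative array keyed on the first field
y = 0.0      # running sum across all rows
for record in rows.splitlines():
    symbol, qty, col3, col4 = record.split(',')               # FS=","
    x[symbol] = float(qty) * (float(col4) - float(col3))      # $2*($4 - $3)
    y += x[symbol]

for z in sorted(x):          # awk's "for (z in x)" order is unspecified, so sort here
    print(z, round(x[z], 2))
print("Net:%s" % round(y, 2))
```

The awk version is shorter mainly because the loop over input, the field splitting and the variable initialization are all implicit.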

This was liveblogged, so please point out any issues, as they may be typos on my part….

Liveblogging: Senior Skills: Sysadmin Patterns

The Beacon Pattern:
– This is a “Get out of the business” pattern
– Identify an oft-occurring and annoying task
– Automate and document it to the point of being able to hand it off to someone far less technical

Example:
– System admins were being put in charge of scheduling rooms in the building
– They wrote a PHP web application to help them automate the task
– They refined the app, documented how to use it, and handed it off to a secretary
– They have to maintain the app, but it’s far less work.

The Community Pattern:

– Prior to launch of a new service, create user documentation for it.
– Point a few early adopters at the documentation and see if they can use the service with minimal support
– Use feedback to improve documentation, and the service
– Upon launch, create a mailing list, forum, IRC channel, or Jabber chat room and ask early adopters to help test it out.
– Upon launch, your early adopters are the community, and they’ll tell new users to use the tools you’ve provided instead of calling you.

Example:
– A beowulf cluster for an academic department
– Documented like crazy, early adopters were given early access to the cluster (demand was high)
– Created a mailing list, early adopters were added to it with their consent, functionality was tested with them.
– Email announcing launch mentioned the early adopters in a ‘thank you’ section, and linked them to their mailing list.

The DRY pattern
DRY = Don’t repeat yourself
Identify duplicate code in your automation scripts
Put the shared subroutines in an include file, and include it in your scripts.

Example:
– “sysadmin library”
– /var/lib/adm/.*pl
– Elapsed time and # of lines to script a task for which the library was useful plunged dramatically
– new tasks were thought up that were not considered before but were obvious now (ie, users that want to change their username)
– migrating to new services became much easier
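
The library in the talk was a set of Perl files. As an illustration of the same idea in Python, here is a sketch in which two small “scripts” share helpers instead of each carrying its own copy. Every name here (valid_username, home_dir, the username rule, the /home base path) is hypothetical:

```python
import re

# --- shared helpers: these would live in one module that every script imports ---
USERNAME_RE = re.compile(r'^[a-z_][a-z0-9_-]{0,31}$')   # assumed site policy

def valid_username(name):
    """Return True if name fits the (assumed) local username policy."""
    return bool(USERNAME_RE.match(name))

def home_dir(name, base='/home'):
    """Build the home directory path for a user."""
    return '%s/%s' % (base, name)

# --- two "scripts" that reuse the helpers rather than duplicating them ---
def plan_add_user(name):
    if not valid_username(name):
        raise ValueError('bad username: %r' % name)
    return ['create %s' % home_dir(name)]

def plan_rename_user(old, new):
    if not valid_username(new):
        raise ValueError('bad username: %r' % new)
    return ['move %s to %s' % (home_dir(old), home_dir(new))]

print(plan_add_user('jonesy'))
print(plan_rename_user('jonesy', 'brian'))
```

Once the helpers exist, a task like renaming a user becomes a few lines, which matches the point about new tasks becoming obvious.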

The Chameleon Pattern
– Identify commonalities among your services
– Leverage those to create “Chameleon” servers that can be re-purposed on the fly
– Abstract as much of this away from the physical hardware
– Doesn’t need to involve virtualization, though it’s awfully handy if you can do it that way.
[this one is a bit harder to do with MySQL config files]

Example:
[puppet/cfengine were mentioned…]
ldapconfig.py – more than a script: a methodology

– But isn’t installing packages you don’t need bad? Depends on the package….ie, gcc is bad for enterprise

“Junior annoyances”

Terminal issues

Junior:
open terminal, login to machine1
thinks the issue is with machine2, which talks to machine1.
log out of machine1
log into machine2

Senior:
opens 2 terminals each of machine1 and machine2 to start

Junior:
networking issue ticket arrives
logs into server
runs tcpdump

Senior:
networking issue ticket arrives
logs into server
looks at logs

“Fix” vs. “Solution” ie “taking orders”
A junior will try to fix a problem; a senior will try to figure out what the problem is. For example, given “I need a samba directory mounted under an NFS mount”, a junior admin will try to do exactly that, while a senior admin will ask “what are you trying to do with that?” because maybe all they need is a symlink.

Fanboyism
Signs you might be a fanboy:
– Disparaging users of latest stable release of $THING for not using the nightly (unstable) build which fixes more issues
– Creating false/invalid comparisons based on popular opinion instead of experience/facts
– Going against internal standards, breaking environmental consistency, to use $THING instead of $STANDARD (but this is also how disruptive technology works)
– Being in complete denial that most technology at some point or another stinks.
– Evaluating solutions based on “I like” instead of “we need” and “this does”.

Liveblogging: Seeking Senior and Beyond

I am attending the Professional IT Community Conference – it is put on by the League of Professional System Administrators (LOPSA), and is a 2-day community conference. There are technical and “soft” topics — the audience is system administrators. While technical topics such as Essential IPv6 for Linux Administrators are not essential for my job, many of the “soft” topics are directly applicable and relevant to DBAs too. (I am speaking on How to Stop Hating MySQL tomorrow.)

So I am in Seeking Senior and Beyond: The Tech Skills That Get You Promoted. The first part talks about the definition of what it means to be senior, and it completely relates to DBA work:
works and plays well with others
understands “ability”
leads by example
lives to share knowledge
understands “Service”
thoughtful of the consequences of their actions
understands projects
cool under pressure

Good Qualities:
confident
empathetic
humane
personal
forthright
respectful
thorough

Bad Qualities:
disrespectful
insensitive
incompetent
[my own addition – no follow through, lack of attention to detail]

The Dice/Monster Factor – what do job sites see as important for a senior position?

They back up the SAGE 5-year experience requirement
Ability to code in newer languages (Ruby/Python) is more prevalent (perhaps cloud-induced?)

The cloud allows sysadmin tasks to be done by anyone…..so developers can do sysadmin work, and you end up seeing schizophrenic job descriptions such as

About the 5-year requirement:
– Senior after 5 years? What happens after 10 years?
– Most electricians, by comparison, haven’t even completed an *apprenticeship* in 5 years.

Senior Administrators Code
– not just 20-line shell scripts
– coding skills are part of a sysadmin skill
– ability to code competently *is* a factor that separates juniors from seniors
– hiring managers expect senior admins to be competent coders.

If you are not a coder
– pick a language, any language
– do not listen to fans, find one that fits how you think, they all work…..
– …that being said, some languages are more practical than others (ie, .NET probably is not the best language to learn if you are a Unix sysadmin).

Popular admin languages:
– Perl: classic admin scripting language. Learn at least the basics, because you will see it in any environment that has been around for more than 5 years.

– Ruby: object-oriented language for people who mostly like Perl (except for its OO implementation)

– Python: object-oriented language for people who mostly hate Perl, objects or no objects. For example, you don’t have to create a String object to send an output.

But what if you do not have time to learn how to program?

– senior admins are better at managing their time than junior admins, so perhaps managing time is the first skill to work on
– time management means you’ll have more time to do things; it doesn’t mean all work, work, work.
– Read Time Management for System Administrators – there is a Google Video of a presentation by the author, Tom Limoncelli.

Consider “The Cloud”
– starting to use developer APIs to perform sysadmin tasks, so learning programming is good.
– still growing, could supplant large portions of datacenter real estate
– a coder with sysadmin knowledge: Good
– a sysadmin with coding knowledge: Good
– a coder without sysadmin knowledge: OK
– a sysadmin with no coding interest/experience: Tough place to be in

Senior Admins Have Problems Too
Many don’t document or share knowledge
Many don’t do a good job keeping up with their craft
Cannot always be highlighted as an example of how to deal with clients
Often reinvent the wheel – also usually there is no repository
Often don’t progress beyond the “senior admin” role

….on the other hand…..
cynicism can be good…..

Advice:
learn from the good traits
observe how others respond to their bad traits
think about how you might improve upon that
strive to work and play well with others, even if you don’t have a mentor for good/bad examples.

Now he’s moving on to Patterns in System Administration….

MySQL Track at Kaleidoscope

On Monday, Ronald Bradford posted that the independent Oracle Developer Tools User Group had opened up their Kaleidoscope Conference, well-known throughout the Oracle community for in-depth technical sessions for developers, to the MySQL community. Giuseppe Maxia posted his thoughts on Tuesday.

We have confirmed that there will be an entire MySQL track at Kaleidoscope! Because Kaleidoscope is less than 8 weeks away, we could not go through a standard call for papers. Ronald and I have been working to come up with appropriate topics and speakers for an audience that uses MySQL but is probably more familiar with Oracle. We contacted folks we thought would be interested, and who we thought could make it logistically, as the conference is in Washington, D.C.

We have (almost) finalized the list of speakers; the session abstracts will be finalized in the next few days. You can see the speakers at Kaleidoscope’s MySQL page, but I’ve also listed them below (alpha by last name):

Philip Antoniades, Sun/MySQL
Ronald Bradford, 42SQL
Sheeri K. Cabral, The Pythian Group
Laine Campbell, PalominoDB
Patrick Galbraith, Northscale
Sarah Novotny, Blue Gecko
Padraig O’Sullivan, Akiba Technologies Inc.
Jay Pipes, Rackspace Cloud
Dossy Shiobara, Panoptic.com
Matt Yonkovit, Percona

There are one or two more speakers we are waiting to hear back from. There will be 19 sessions, so some speakers will have more than one session.

I am very excited that MySQL has its own track at Kaleidoscope. In addition, Ronald and I will be able to attend our very first event as Oracle ACE Directors – the Sundown Sessions are a Birds-of-a-Feather-type discussion, with the Oracle ACE Directors being the panelists and the community asking questions. Immediately after the Sundown Sessions is a “Meet the Oracle ACE” event, the only part of the conference officially sponsored by Oracle.

2010 O’Reilly MySQL Conference Slides and Videos

Here’s a matrix of all the videos up on YouTube for the 2010 O’Reilly MySQL Conference and Expo. The matrix includes the title, presenter, slide link (if it exists), video link, and link to the official conference detail page, where you can rate the session and provide feedback that the presenter will see. They are grouped mostly by topic, except for the main stage events (keynote, ignite) and interviews.

If there’s a detail missing (ie, slides, or there are other videos you know about), please add a comment so I can make this a complete matrix.






Keynotes

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
State of the Dolphin | Edward Screven (Oracle) | | 29:10 | session 12440
O’Reilly Radar | Tim O’Reilly | | 36:38 | session 12441
MySQL at Facebook | Mark Callaghan (Facebook) | | 21:05 | session 14841
State of MariaDB | Monty Widenius (Monty Program Ab) | | 41:54 | session 12443
State of Drizzle | Brian Aker (Data Differential) | | 44:58 | session 12442
Keynote: Under New Management: Next Steps for the Community | Sheeri K. Cabral (Pythian) | | 18:16 | session 14808
State of the MySQL Community | Kaj Arnö (Sun Microsystems GmbH) | | 38:06 | session 12498
The Engines of Community | Jono Bacon (Canonical, Ltd) | | 47:51 | session 14796
The Best of Ignite MySQL | Sarah Novotny, Gerry Narvaja, Gillian Gunson, Mark Atwood | | 23:25 |
RethinkDB: Why Start a New Database Company in 2010 | Slava Akhmechet (RethinkDB), Michael Glukhovsky (RethinkDB) | | 44:49 | session 14891

Ignite Talks

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
Backups Don’t Make Me Money | Sarah Novotny (Blue Gecko) | | 6:52 | Also in Best of Ignite
Calpont’s InfiniDB | Robin Schumacher (Calpont) | | 6:40 |
A Future [for MySQL] | Mark Callaghan (Facebook) | | 6:37 |
A Guide to NoSQL | Brian Aker (Data Differential) | | 6:27 |
Guide to NoSQL, redux | Mark Atwood (Gear6) | | 4:22 | Also in Best of Ignite
The Gunson Rules of Database Administration | Gillian Gunson | | 6:08 | Also in Best of Ignite
MariaDB: 20 slides, 5 minutes, the full Monty | Monty Widenius (Monty Program Ab) | | 6:18 |
MySQLtuner 2.0 | Sheeri K. Cabral (Pythian) | PDF | 5:31 |
“Painting” Data with Entrance (free) and MySQL | Tod Landis (dbEntrance Software) | | 5:11 |
Three Basic Laws of DB Diagnosis | Gerry Narvaja (OpenMarket, Inc) | | 2:33 | Also in Best of Ignite
What is the difference between XtraDB and others? | Baron Schwartz (Percona) | | 6:59 |
What is a Performance Model for SSD’s? | Bradley C. Kuszmaul (Tokutek) | | 7:00 |

Interviews

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
Why having InnoDB and MySQL in the same company will improve performance, the way Drizzle leaves the past behind, and other issues in MySQL Development. | Kaj Arnö (MySQL) | | 10:40 |
What’s hard to optimize in MySQL, how they’ve improved performance, and what’s in the performance schema. | Peter Gulutzan | | 3:01 |
Write-scaling, MySQL performance in an EC2 cloud, why they wrote the book MySQL High Availability. | Charles Bell, Mats Kindahl, and Lars Thalmann | | 7:04 |
How third-party ads make web sites slow, why mobile devices are the next frontier in Web performance. | Steve Souders, Web performance expert | | 7:12 |
Attractions of Gearman, the adaptation of database technology to large multi-core and multi-node environments, and what relational databases are and are not great for. | Brian Aker | | 9:01 |
Thoughts on Drizzle and MySQL | Sheeri K. Cabral (Pythian) | | 9:22 |
Democratic culture of Monty Program AB | Henrik Ingo (Monty Program AB) | | 2:20 |
Thoughts on democratic companies and his role is in coding and management | Monty Widenius (Monty Program Ab) | | 8:52 |
How MariaDB emerged as a superset of MySQL, and development issues. | Kurt von Finck (MontyProgram Ab) | | 7:23 |

Tutorials

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
MySQL Configuration Options and Files: Basic MySQL Variables (Part 1) | Sheeri K. Cabral (Pythian) | PDF | 1:25:04, pre-break; 1:35:47, post-break | session 12408
MySQL Configuration Options and Files: Intermediate MySQL Variables (Part 2) | Sheeri K. Cabral (Pythian) | PDF | 1:25:04, pre-break; 1:24:28, post-break | session 12435

Sessions

Performance

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
Advanced Sharding Techniques with Spider | Kentoku Shiba () and Daniel Saito (MySQL) | | 39:55 | session 12619
Boosting Database Performance with Gearman | Eric Day (Rackspace Cloud), Giuseppe Maxia (MySQL) | | 46:18 | session 13310
High Concurrency MySQL | Domas Mituzas (Facebook) | PDF | 49:53 | session 13285
High-throughput MySQL | Mark Callaghan (Facebook), Ryan Mack (Facebook), Ryan McElroy (Facebook) | | 57:31 | session 13223
Introduction to InnoDB Monitoring System and Resource & Performance Tuning | Jimmy Yang (Oracle Corporation) | ZIP | 40:49 | session 13508
Linux Performance Tuning and Stabilization Tips | Yoshinori Matsunobu (Sun Microsystems) | slideshare.net | 48:45 | session 13252

Debugging and Reactive/Proactive Monitoring

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
Better Database Debugging for Shorter Downtimes | Rob Hamel (Pythian) | PDF | 33:13 | session 13021
Continual Replication Sync | Danil Zburivsky (Pythian) | ODP | 45:57 | session 13428
Find Query Problems Proactively With Query Reviews | Sheeri K. Cabral (Pythian) | PDF | 45:59 | session 13267
Monitoring Drizzle or MySQL With DTrace and SystemTap | Padraig O’Sullivan (Akiba Technologies Inc.) | PDF | 42:33 | session 12472

Security / Risk Management

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
Achieving PCI Compliance with MySQL | Ryan Lowe (Percona) | PPTX | 58:24 | session 12484
Security Around MySQL | Danil Zburivsky (Pythian) | ODP | 37:27 | session 13458
Securich – A Security and User Administration plugin for MySQL | Darren Cassar (Trading Screen Inc) | PDF | 54:05 | session 13351

Other DBA-related

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
Galera – Synchronous Multi-master Replication For InnoDB | Seppo Jaakola (Codership), Alexey Yurchenko (Codership) | PDF | 47:39 | session 13286
Large Deployment Best Practices | Nicklas Westerlund (Electronic Arts) | | 40:37 | session 12567
MySQL Cluster: An Introduction | Geert Vanderkelen (Sun Microsystems) | PDF | 47:30 | session 12469
Migration From Oracle to MySQL : An NPR Case Study | Joanne Garlow (National Public Radio) | PPT | 34:35 | session 13404
New Replication Features | Mats Kindahl (Sun Microsystems), Lars Thalmann (MySQL) | PDF | 53:32 | session 12451
Successful and Cost Effective Data Warehouse… The MySQL Way | Ivan Zoratti (MySQL) | PDF | 1:00:25 | session 13343
The Thinking Person’s Guide to Data Warehouse Design | Robin Schumacher (Calpont) | slideshare.net | 59:50 | session 13366
Using Drizzle | Eric Day (Rackspace Cloud), Monty Taylor (Rackspace Cloud) | | 58:21 | session 13308

Other Developer-related

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
Connecting MySQL and Python | Geert Vanderkelen (Sun Microsystems) | PDF | 54:56 | session 13251
MySQL Plugin API: New Features | Sergei Golubchik (MariaDB) | ZIP | 40:14 | session 13143
PHP Object-Relational Mapping Libraries In Action | Fernando Ipar (Percona) | PDF | 49:45 | session 12489
Scalability and Reliability Features of MySQL Connector/J | Mark Matthews (Oracle), Todd Farmer (Oracle Corporation) | PDF | 39:07 | session 12448
Time Zones and MySQL | Sheeri K. Cabral (Pythian) | PDF | 45:54 | session 12412
Using Visual Studio 2010MySQL | Reggie Burnett (Oracle), Mike Frank (Oracle) | ZIP | 37:53 | session 13365

Videos of Pythian Sessions from the 2010 O’Reilly MySQL Conference and Expo

Here’s a sneak peek at a video matrix: these are all the videos that include Pythian Group employees at the MySQL conference. I hope to have all the rest of the videos processed and uploaded within 24 hours, with a matrix similar to the one below (but of course with many more sessions).

Title | Presenter | Slides | Video link (hr:min:sec) | Details (Conf. site link)
Main Stage
Keynote: Under New Management: Next Steps for the Community | Sheeri K. Cabral (Pythian) | N/A | 18:16 | session 14808
Ignite talk: MySQLtuner 2.0 | Sheeri K. Cabral (Pythian) | PDF | 5:31 | N/A
Interview
Thoughts on Drizzle and MySQL | Sheeri K. Cabral (Pythian) | N/A | 9:22 | N/A
Tutorials
MySQL Configuration Options and Files: Basic MySQL Variables (Part 1) | Sheeri K. Cabral (Pythian) | PDF | 1:25:04, pre-break; 1:35:47, post-break | session 12408
MySQL Configuration Options and Files: Intermediate MySQL Variables (Part 2) | Sheeri K. Cabral (Pythian) | PDF | 1:25:04, pre-break; 1:24:28, post-break | session 12435
Sessions
Better Database Debugging for Shorter Downtimes | Rob Hamel (Pythian) | PDF | 33:13 | session 13021
Find Query Problems Proactively With Query Reviews | Sheeri K. Cabral (Pythian) | PDF | 45:59 | session 13267
Time Zones and MySQL | Sheeri K. Cabral (Pythian) | PDF | 45:54 | session 12412
Security Around MySQL | Danil Zburivsky (The Pythian Group) | ODP | 37:27 | session 13458
Continual Replication Sync | Danil Zburivsky (The Pythian Group) | ODP | 45:57 | session 13428

MySQL Conference Notes

These are not my notes on the MySQL conference that just occurred; they are my thoughts about MySQL conferences in general. Baron wrote in The History of OpenSQL Camp:

After O’Reilly/MySQL co-hosted MySQL Conference and Expo (a large commercial event) that year, there was a bit of dissatisfaction amongst a few people about the increasingly commercial and marketing-oriented nature of that conference. Some people refused to call the conference by its new name (Conference and Expo) and wanted to put pressure on MySQL to keep it a MySQL User’s Conference.

During this year’s conference, I heard a lot of concern about whether or not O’Reilly would have a MySQL conference, and whether or not Oracle would decide to sponsor. I heard all of the following (in no particular order):

* If O’Reilly does not have a conference, what will we do?
* Maybe OpenSQLCamp (http://www.opensqlcamp.org) can be bigger instead of having an O’Reilly conference, because the O’Reilly conference is more commercial.
* If Oracle does not sponsor the O’Reilly conference, it means they don’t care about MySQL/the MySQL community.
* If Oracle sponsors the O’Reilly conference, they’ll ruin it by making it even more commercial.
* Oracle shouldn’t sponsor the O’Reilly conference, they should make a different technical conference, in a different hotel/location and bigger (6,000 people instead of 2,000).
* Oracle shouldn’t make their own technical conference for MySQL, they should let user groups get together and then sponsor it, like they do with Collaborate.

Obviously there are mixed messages here — I don’t see any clear directive from the community. Plenty of people have a strong opinion. What I do see happening is that there will probably be plenty of options:

I know that OpenSQLCamp is not dead — there will be 2 this year, check the website for details.

I also know that there will be a *real* MySQL track at Oracle OpenWorld. There was a rumor that the number of sessions would be fewer than 5, but sources on the inside have said that will not be the case.

I also know that we will hear from O’Reilly in the next few months about next year’s MySQL conference.

So, regardless of what happens, the nay-sayers will say how awful it is, and the pollyannas will say how great it is. There are plenty of reasons that each scenario is good and bad; so keep that in mind.

Liveblogging: Edward Screven State of the Dolphin Keynote

Edward Screven is Chief Corporate Architect at Oracle. He has been at Oracle since 1986, handles technology and architecture decisions, and is responsible for all open source at Oracle, as well as company-wide initiatives on standards management and security: http://en.oreilly.com/mysql2010/public/schedule/detail/12440.

Where MySQL fits within Oracle’s structure.

Oracle’s Strategy: Complete. Open. Integrated. (compare with MySQL’s strategy: Fast, Reliable, Easy to Use).

Most of the $$ spent by companies is not on software, but on integration. So Oracle makes software based on open standards that integrates well.

Most of the components talk to each other through open standards, so that customers can use other products, and standardize on the technology, which makes it much more likely that customers will continue to use Oracle.

Oracle invested heavily in open source even before the acquisition. Linux (Oracle Unbreakable Linux = Oracle Enterprise Linux = OEL). Clustering, data integrity, storage validation, asynchronous I/O, virtualization technology that has been accepted back into the Linux kernel. They have enhanced Xen, in order to make a good Oracle VM server for x86. With Sun, they now have VirtualBox. In the 3 years of OEL, they have over 4,500 companies.

Oracle never settles for being second best at any level of the stack.
“Complete” means we meet most customer requirements at every level.
That’s why MySQL matters to Oracle and Oracle customers.

MySQL is small, lightweight, easy to install and easy to manage. These qualities are different from Oracle’s, so MySQL is the RIGHT choice for many applications, and adding MySQL to Oracle’s database offerings makes the Oracle solution more complete.

Investing in MySQL means:
making MySQL a better MySQL. Keep MySQL the #1 db for web apps.
improve engineering, consulting and support
24×7, world-class Oracle support

MySQL community edition: “If we stop investing in the community edition, MySQL will stop being ubiquitous”.

They want to focus even more effort on:
web
embedded
telecom
integration with other products in the LAMP stack
Windows — #1 download platform is Windows, but it’s not the #1 *deployment* platform.

They want to invest more money in allowing Oracle tools to work with MySQL too. For example, Oracle Enterprise Manager for monitoring, Oracle Secure Backup for backups, and Oracle Audit Vault for auditing. (Pythian already has a free Oracle Grid Control plugin to monitor MySQL).

Oracle will keep the pluggable storage engine API. They are starting a Storage Engine Advisory Board to talk about their requirements and experiences and future plans and product direction.

MySQL 5.5 is in beta; that’s the big news. InnoDB is the default storage engine there.

5.5 is much faster, including more than a 10x improvement in recovery times. There’s a 200% read-only performance gain, and the read/write performance gain is 364% over MySQL 5.1.40. These numbers are for a large number of concurrent connections, like 1024 connections.

Better object/connection management, database administration, data modelling in MySQL workbench.

MySQL Cluster 7.1, improved administration, higher performance, java connectors, carrier grade availability and performance. “Extreme availability”.

They’re also making support (MySQL Enterprise) better.

MySQL Enterprise Backup – formerly InnoDB hot backup. This is now included in MySQL Enterprise, not a separately paid for feature.

(Demo of MySQL enterprise manager)

In conclusion:
MySQL is important to Oracle and our customers; it’s part of Oracle’s complete, open, integrated strategy. Oracle is making MySQL better TODAY. Then came a “come to Oracle OpenWorld” pitch (I’ve been, and it certainly is a great conference).

Achievements of Women in Technology

Today is Ada Lovelace day, a day to “draw attention to achievements of women in technology.”

So here I am, drawing some attention 🙂 All the names contain links (mostly Wikipedia links), so if you are so inclined, you can learn more (you could start at Wikipedia’s article on women in computing). Perhaps you will realize that there are lots of women in technology already, more than you first thought.

That being said, this is by no means a comprehensive list.


Of course, there’s Ada Lovelace herself, but I am focusing on women still alive today (although I do have to mention Grace Hopper, who popularized the term “debugging”). As well, I might mention the amazing Allison Randal, well-known in the Perl community and one of the major organizers of OSCon. But I do want to focus on some of the great achievements of lesser-known women, because we are indeed hiding (in plain sight!) everywhere.

Did you like Apple’s Newton PDA? Many people believe it was (and still is) one of the best-designed PDAs. Donna Auguste helped develop it.

Ever played the video game Centipede? Thank Dona Bailey. Corrinne Yu has done a lot of work in the gaming field, and is currently a Halo lead at Microsoft.

Wireshark and Ethereal, two of the more popular security tools, are the subject of well-known books written by Angela Orebaugh.

The first commercial website is credited to Jennifer Niederst Robbins, who designed the Global Network Navigator.

Mary Ann Davidson is the Chief Security Officer at Oracle.

Lynne Jolitz helped develop 386BSD.

Wendy Hall, current president of the ACM (since 2008).

IBM Master Inventor Amanda Chessell.

Elaine Weyuker’s Wikipedia page starts out with “Elaine J. Weyuker is an ACM Fellow, an IEEE Fellow, and an AT&T Fellow at Bell Labs for research in software metrics and testing as well as elected to the National Academy of Engineering. She is the author of over 130 papers in journals and refereed conference proceedings.” From there, it gets more impressive.

Having written a book myself, I can tell you it is definitely an achievement! Ruth Aylett’s popular work Robots: Bringing Intelligent Machines to Life certainly qualifies her to make this list.

I challenge all the readers out there to take a few minutes to note the achievements of women in technology and science in their life. A few weeks ago I posted a list of women who taught me science or technology. That may be an easier way for people to celebrate the day than researching the great women of science and technology, and it means we will not see the same “top 10 women in science and technology” lists over and over today.
