gnulinux [Braindump]

Puppet runtime message triage

Let's say you get a message like this (this is a notice, but you'd break down an error the same way):

Notice: /Stage[main]/Splunk::Forwarder/File[/opt/splunkforwarder/etc/system/local/outputs.conf]/ensure: created
         ----- ----  ------  --------- ---- --------------------------------------------------  ------  -------
            \    \      \        \       \                           \                             \       `> Action's desired end state
             \    \      \        \       \                           \                             `> Action property
              \    \      \        \       \                           `> Resource name
               \    \      \        \       `> Resource type
                \    \      \        `> Manifest name
                 \    \      `> Module name
                  \    `> Stage name (main is the default)
                   `> Run stage

You can break it up into its (slash-delimited) component parts:

Stage[main]

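This is the run stage the resource belongs to. Puppet puts every class in the default stage, main, unless a manifest assigns it to a custom stage, so you can usually ignore this part.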

Splunk::Forwarder

This is the class name. By Puppet's autoloading convention, it means the module name is Splunk, and the manifest name is Forwarder.

To find the module, first get the module path with:

puppet module list 2>/dev/null | awk '/^\// { print $1; }'

Then, for each directory in the module path, run this to find a module directory with the correct name. Convert the module name to lower case.

find <directory> -maxdepth 1 -type d -name <module> -print

Once you've found the module directory, the manifest file should be here:

<path>/<module>/manifests/<manifest>.pp

For example, this is how it might look in practice, given the above log message:

# puppet module list 2>/dev/null | awk '/^\// { print $1; }'
/etc/puppetlabs/code/modules
/opt/puppetlabs/puppet/modules
 
# for directory in /etc/puppetlabs/code/modules /opt/puppetlabs/puppet/modules; do find "${directory}" -maxdepth 1 -type d -name splunk -print; done
/etc/puppetlabs/code/modules/splunk
 
# ls /etc/puppetlabs/code/modules/splunk/manifests/forwarder.pp
/etc/puppetlabs/code/modules/splunk/manifests/forwarder.pp
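
The class-name-to-path mapping above can be sketched as a small shell function. This is only a sketch: class_to_manifest is a made-up helper name, and it assumes the module follows Puppet's standard autoloading layout (a class named after the module itself lives in init.pp, and further :: separators become subdirectories under manifests/). The output is relative to whichever module path directory holds the module.

```sh
#!/bin/sh
# Hypothetical helper: map a class name from a Puppet log message (e.g.
# "Splunk::Forwarder") to its manifest path, relative to the module path.
class_to_manifest() {
    # Log messages capitalize each segment; directories and files are lower case
    class=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "${class}" in
        *::*)
            module=${class%%::*}    # text before the first "::"
            rest=${class#*::}       # text after the first "::"
            # Any remaining "::" separators become subdirectories
            rest=$(printf '%s' "${rest}" | sed 's|::|/|g')
            printf '%s/manifests/%s.pp\n' "${module}" "${rest}"
            ;;
        *)
            # A class named after the module itself lives in init.pp
            printf '%s/manifests/init.pp\n' "${class}"
            ;;
    esac
}
```

For the log message above, class_to_manifest 'Splunk::Forwarder' prints splunk/manifests/forwarder.pp.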

File[/opt/splunkforwarder/etc/system/local/outputs.conf]

Now that you know which file to look in, you need to find the specific resource in question. In this case, you're looking for a “file” resource named /opt/splunkforwarder/etc/system/local/outputs.conf.

On a side note: file resources are usually named after the file they manage, but that's not necessarily true. What's listed in the Puppet log message is the name of the resource, which doesn't technically have to be the filename. Just something to be aware of.

It should be pretty easy to find by grepping for the resource name (which usually looks like a filename); but in general you're looking for something like:

<type> { "<name>":...

…or in this case:

file { "/opt/splunkforwarder/etc/system/local/outputs.conf":...
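
The grep step can be sketched like this. find_resource_title is a made-up name, and the search is a plain fixed-string match on the resource title, so it will miss titles that the manifest builds from variables (for example a path assembled from a parameter); in that case, grep for the last path component instead.

```sh
#!/bin/sh
# Hypothetical helper: print the lines (with line numbers) of a manifest
# that mention a given resource title.
find_resource_title() {
    manifest=$1
    title=$2
    # -n: show line numbers; -F: treat the title as a literal string, so the
    # dots and slashes in a filename are not interpreted as a regex
    grep -n -F -- "${title}" "${manifest}"
}
```

Usage, given the manifest found earlier: find_resource_title /etc/puppetlabs/code/modules/splunk/manifests/forwarder.pp /opt/splunkforwarder/etc/system/local/outputs.conf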

ensure: created

You should be off and running at this point, but just to clarify: ensure is an attribute of the resource (that's the Puppet term), and created is the change Puppet made to bring the resource into the desired state (in this case, it created the file because it didn't exist yet). The attributes are all set in the resource declaration.

SSH file relay

Overview

This is the problem being solved for:

You have a file on a remote system named srchost
You want to copy that file to a second remote system named desthost
srchost and desthost cannot connect directly to each other using SSH
Your local machine can connect to both systems using SSH

This solution allows you to use your local system as a seamless relay. The advantages of this over just doing two separate rsync/scp transfers are:

It takes less time because you're not waiting for the download to finish before starting the upload
The actual transfer speed is faster because the data are never written to your local disk

Pre-transfer recon

First, get the size of the file you're transferring, in bytes, from srchost. This step isn't technically necessary but if it's a large file you definitely want to have this information, since it'll allow you to have an ETA for the transfer.

ssh srchost "du --bytes file.gz"
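
The --bytes option is specific to GNU du. If srchost doesn't have it, a sketch of a more portable way to get the same number is to count the bytes with wc (file_size_bytes is a made-up name; note that wc reads the whole file, so this is slower than du on very large files):

```sh
#!/bin/sh
# Hypothetical helper: print a file's size in bytes using only POSIX tools.
file_size_bytes() {
    # Reading from stdin keeps wc from printing the filename; tr strips the
    # padding that some wc implementations add around the number
    wc -c < "$1" | tr -d '[:space:]'
}
```

Over SSH, the equivalent one-liner would be: ssh srchost "wc -c < file.gz"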

Doing the transfer

You can log out of srchost now if you like (if you're short on screen space or something). To do the transfer, you'll SSH to srchost and cat the file, with cat's stdout wired up to pv's stdin, and then pv's stdout wired up to the stdin of another ssh process which is taking that stdin and writing it to a file on desthost:

ssh srchost "cat file.gz" | pv -s 903242 | ssh desthost "cat - > file.gz"

The -s <bytes> argument passed to pv is the size of the file being transferred (since you're using a pipe, pv can't determine the final size on its own). If you don't know the size and forgot to get it (with ls, du, etc.) then you can leave that argument off and pv will still display the transfer speed; you just won't get an ETA. If you don't have pv installed…well, you should really install it for stuff like this but if you really don't have it then you can just leave it out of the pipeline entirely. The transfer will still work the same way, you just won't get any progress/rate display. You can always log in to the destination system and monitor the file size there, I guess.

Here's a script that should make this pretty painless. Believe it or not it actually has rudimentary resume capability, since it checks the destination file size to see if something's there already and uses tail to start re-transferring at the right byte. I still wouldn't trust it any further than I could throw it, but it might save your bacon. Just re-run the same command line and it'll automatically resume (it's designed to be idempotent). If you do have to resume, I'd run an md5sum or something on the file on both systems to make sure they match; it worked when I tried it (I killed the same transfer three or four times to test) but I make no binding promises!

ssh-relay-transfer.sh

#!/bin/sh
 
## In order for this script to work, the following must be installed (except
## for pv, these should all be in your base systems):
##
## On the local host       : ssh expr pv
## On the source host      : ssh du cut tail
## On the destination host : ssh du cut touch cat
##
## Exit status is 0 on success, 2 if the destination file is the same size
## as the source file.  If one of the commands fails, it'll probably exit
## with a 1.  The external commands might exit 2 as well; if you have verbose
## on, then you'll see a message on stderr from this script if it's exiting 2
## because the files were the same size (so if you have an exitstatus of 2 and
## no message, you know it came from an external command).
 
## Hostnames of the two servers, as you would pass them to the SSH client
source_host="srchost"
destination_host="desthost"
 
## These filenames are going straight to the SSH client, so relative paths
## are relative to your home directory on each remote system.
##
## There's no reason the file in question has to be a .gz file, I'm just using
## that extension as an example; any file will work the same way.
##
## You can leave the destination filename empty and the source filename will
## be used (including any path)
source_filename="testfile.gz"
#destination_filename=""
 
## Set this to >0 to have the script print some diagnostics to stderr
verbose=1
 
###############################################################################
###############################################################################
 
## Retrieve the source file's size
source_file_size=$(ssh "${source_host}" "du --bytes ${source_filename} | cut --fields 1")
 
## Use source filename for destination if we don't already have one
[ -z "${destination_filename}" ] && destination_filename="${source_filename}"
 
## Retrieve the destination file's size
destination_file_size=$(ssh "${destination_host}" "du --bytes ${destination_filename} 2>/dev/null | cut --fields 1")
[ -z "${destination_file_size}" ] && destination_file_size=0
 
if [ ${destination_file_size} -eq ${source_file_size} ]; then
    ## Files are the same size, we must be done; exit with a unique status
    [ ${verbose} -ge 1 ] && echo "File sizes are the same; looks like we're done." >&2
    exit 2
fi
 
if [ ${destination_file_size} -gt 0 ]; then
    byte_offset=$(expr ${destination_file_size} + 1)
    transfer_size=$(expr ${source_file_size} - ${destination_file_size})
else
    ## tail counts bytes from 1, so +1 means "start at the first byte"
    byte_offset=1
    transfer_size=${source_file_size}
fi
 
if [ ${verbose} -ge 1 ]; then
    ## Print diagnostics to stderr
    echo "Source                : ${source_host}:${source_filename}"                       >&2
    echo "Destination           : ${destination_host}:${destination_filename}"             >&2
    echo "Source file size      : ${source_file_size}"                                     >&2
    echo "Destination file size : ${destination_file_size}"                                >&2
    echo "Byte offset           : ${byte_offset}"                                          >&2
    echo "Transfer size         : ${transfer_size}"                                        >&2
    echo ""                                                                                >&2
    echo "###############################################################################" >&2
    echo ""                                                                                >&2
fi
 
## Release the hounds!
ssh "${source_host}" "tail -c +${byte_offset} ${source_filename}" | pv -s ${transfer_size} | ssh "${destination_host}" "touch ${destination_filename}; cat - >> ${destination_filename}"
 
## EOF
########
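
If you want to convince yourself that the resume arithmetic is right before trusting it with a real transfer, here's a sketch that does the same thing between two local files, with no SSH or pv involved. resume_append is a made-up name; it mirrors the script's logic (measure what's already at the destination, then have tail start one byte past that) rather than reusing the script itself.

```sh
#!/bin/sh
# Hypothetical local stand-in for the relay script's resume logic.
resume_append() {
    src=$1
    dest=$2
    # Size of what has already arrived (0 if the destination doesn't exist yet)
    if [ -f "${dest}" ]; then
        dest_size=$(wc -c < "${dest}" | tr -d '[:space:]')
    else
        dest_size=0
    fi
    # tail -c +N starts output at byte N, counting from 1, so skipping
    # dest_size bytes means starting at byte dest_size + 1
    tail -c "+$((dest_size + 1))" "${src}" >> "${dest}"
}
```

To try it, truncate a copy of a file (for example with head -c), run resume_append against the original, and compare the two with cmp.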

NTP troubleshooting

The "reach" column

This column (from the output of ntpdc -p) shows the success of the most recent synchronization attempts as an octal number. The short answer is that a completely unreachable server will have 0 in this column, and the ideal value in the standard configuration is 377. You can use bc and/or Python to convert the octal number into its binary equivalent (for this example, what appears in the “reach” column is 175):

% echo 'obase=2; ibase=8; 175' | bc
1111101
% python -c 'print("{0:08b}".format(0o175))'
01111101

There are a couple of things to note about these commands:

For bc: there are no leading zeroes in the output; that's why bc's output in this example is seven digits and the output from the Python one-liner is eight digits. The answer is still correct, you just have to remember to watch for that when interpreting the output.
For Python: you need to add the 0o prefix to the input value (i.e. to make 0o175). That's Python's notation for octal (in the same way that 0x is the prefix for hexadecimal and 0b is the prefix for binary) and is how Python knows that the input number is in octal instead of decimal. Python 2 also accepted a bare leading zero (0175), but that's a syntax error in Python 3. (You can add a leading zero in the bc command if you like, but there it's just a leading zero and bc will strip it; bc knows it's octal because of the ibase=8 statement.)

In both cases, the output means that the eight most recent sync attempts went as follows: a failure, followed by five successes, followed by a failure, and finally the most recent attempt which succeeded (you read the digits left-to-right).
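
If you'd rather not depend on bc or Python, the same conversion can be sketched in plain shell arithmetic. reach_to_bits is a made-up name; it relies on POSIX shell arithmetic treating a constant with a leading zero as octal, and it always pads to eight digits.

```sh
#!/bin/sh
# Hypothetical helper: convert the octal "reach" value into eight binary
# digits, oldest sync attempt on the left, most recent on the right.
reach_to_bits() {
    # The leading zero makes the shell read the value as octal
    value=$((0$1))
    bits=""
    i=0
    while [ "${i}" -lt 8 ]; do
        # Peel off the lowest bit and prepend it to the result
        bits="$((value & 1))${bits}"
        value=$((value >> 1))
        i=$((i + 1))
    done
    printf '%s\n' "${bits}"
}
```

For the example above, reach_to_bits 175 prints 01111101, and the ideal value 377 comes out as 11111111.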

The reference documentation for this feature is in RFC 1305 ("Network Time Protocol"), section 3.2.3 ("Peer Variables").

Hardware vendors

Sources:

http://www.smallbusinesscomputing.com/News/Hardware/5-top-linux-computer-vendors-for-small-business.html

VirtualBox

Based on https://forums.virtualbox.org/viewtopic.php?f=6&t=38646#p173539

Adapter	Mode	State	Purpose
1	NAT	on	Internet Connectivity
2	Host-only	on	Private communication between host and guest that still works with no other networking
3	Bridged	off	Inbound connectivity from other LAN machines
4	Internal	off	Private connectivity between VMs

KDE

Screensaver

Disable image transition effects in the slideshow module:

echo "EffectsEnabled=false" >> ~/.kde/share/config/kslideshow.kssrc

Troubleshooting/fixing Akonadi database corruption

mysql.err

150521 10:41:58 [Note] Plugin 'FEDERATED' is disabled.
150521 10:41:58 InnoDB: The InnoDB memory heap is disabled
150521 10:41:58 InnoDB: Mutexes and rw_locks use GCC atomic builtins
150521 10:41:58 InnoDB: Compressed tables use zlib 1.2.3.4
150521 10:41:58 InnoDB: Initializing buffer pool, size = 80.0M
150521 10:41:58 InnoDB: Completed initialization of buffer pool
InnoDB: Error: checksum mismatch in data file ./ibdata1
150521 10:41:58 InnoDB: Could not open or create data files.
150521 10:41:58 InnoDB: If you tried to add new data files, and it failed here,
150521 10:41:58 InnoDB: you should now edit innodb_data_file_path in my.cnf back
150521 10:41:58 InnoDB: to what it was, and remove the new ibdata files InnoDB created
150521 10:41:58 InnoDB: in this failed attempt. InnoDB only wrote those files full of
150521 10:41:58 InnoDB: zeros, but did not yet use them in any way. But be careful: do not
150521 10:41:58 InnoDB: remove old data files which contain your precious data!
150521 10:41:58 [ERROR] Plugin 'InnoDB' init function returned error.
150521 10:41:58 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
150521 10:41:58 [ERROR] Unknown/unsupported storage engine: innodb
150521 10:41:58 [ERROR] Aborting

150521 10:41:58 [Note] /usr/sbin/mysqld: Shutdown complete

Base command lines

/usr/sbin/mysqld-akonadi --defaults-file=/home/john/.local/share/akonadi/mysql.conf --datadir=/home/john/.local/share/akonadi/db_data/ --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket
mysqldump --defaults-file=/home/john/.local/share/akonadi/mysql.conf --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket --events --flush-privileges
mysql --defaults-file=/home/john/.local/share/akonadi/mysql.conf --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket
mysqladmin --defaults-file=/home/john/.local/share/akonadi/mysql.conf --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket

Troubleshooting

$ innochecksum -v /home/john/.local/share/akonadi/db_data/ibdata1
file /home/john/.local/share/akonadi/db_data/ibdata1 = 169869312 bytes (10368 pages)...
checking pages in range 0 to 10367
page 0 invalid (fails log sequence number check)
$ grep InnoDB ~/.local/share/akonadi/db_data/*/*.frm
Binary file /home/john/.local/share/akonadi/db_data/akonadi/collectionattributetable.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/collectionmimetyperelation.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/collectionpimitemrelation.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/collectiontable.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/flagtable.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/mimetypetable.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/parttable.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/pimitemflagrelation.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/pimitemtable.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/resourcetable.frm matches
Binary file /home/john/.local/share/akonadi/db_data/akonadi/schemaversiontable.frm matches
$ /usr/sbin/mysqld-akonadi --defaults-file=/home/john/.local/share/akonadi/mysql.conf --datadir=/home/john/.local/share/akonadi/db_data/ --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket --innodb-force-recovery=1
$ mysqldump --defaults-file=/home/john/.local/share/akonadi/mysql.conf --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket --events --flush-privileges akonadi | gzip -1 > akonadi.sql.gz
$ mysql --defaults-file=/home/john/.local/share/akonadi/mysql.conf --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket
mysql> drop database akonadi;
Query OK, 11 rows affected (0.71 sec)

mysql> create database akonadi;
Query OK, 1 row affected (0.00 sec)

mysql> \q
$ gzip -dc akonadi.sql.gz | mysql --defaults-file=/home/john/.local/share/akonadi/mysql.conf --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket akonadi
ERROR 1030 (HY000) at line 42: Got error -1 from storage engine
$ # line 42 is the first insert statement after creating the first table
$ # read error message about how you can't alter tables with --force-recovery...OK, but you'll let me drop a database?  NOT COOL BRO
$ mysqladmin --defaults-file=/home/john/.local/share/akonadi/mysql.conf --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket shutdown
$ /usr/sbin/mysqld-akonadi --defaults-file=/home/john/.local/share/akonadi/mysql.conf --datadir=/home/john/.local/share/akonadi/db_data/ --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket 
$ gzip -dc akonadi.sql.gz | mysql --defaults-file=/home/john/.local/share/akonadi/mysql.conf --socket=/home/john/.local/share/akonadi/socket-wopr/mysql.socket akonadi
$ # success!

Dell Remote Access Controller (DRAC) cheatsheet

Enumerating usernames

This is useful when you need to figure out which ID is root so you can change its password:

#!/bin/sh

user_records=$(racadm get 'iDRAC.Users' | cut -d ' ' -f 1)

for user_record in ${user_records}; do
    username=$(racadm get "${user_record}" | awk -F '=' '/^UserName=/ { print $2; }')
    printf "%-14s %s\n" "${user_record}" "${username:-<unset>}"
done

Changing a user's password

The “2” in this example is the user ID, which you can find from the above enumeration loop (although “root” is usually ID 2).

racadm set 'iDRAC.Users.2.Password' "${new_drac_password}"

Getting rid of console logging

Systemd journal

Update /etc/systemd/journald.conf:

Set ForwardToConsole to no
If you're not turning it off entirely, set TTYPath
- The default is /dev/console…perhaps set to /dev/tty1?
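
Here's a sketch of making those edits from a script instead of by hand. set_conf_option is a made-up helper, not a systemd tool: it rewrites an existing Key=value line (commented out or not) or appends one if the key isn't there. Appending is only safe because journald.conf normally has a single [Journal] section; don't reuse this as-is on multi-section files.

```sh
#!/bin/sh
# Hypothetical helper: set Key=value in a simple single-section config file.
set_conf_option() {
    conf_file=$1
    key=$2
    value=$3
    tmp_file="${conf_file}.tmp.$$"
    if grep -q "^#\{0,1\}${key}=" "${conf_file}"; then
        # Replace the existing line, uncommenting it if necessary; "|" is the
        # sed delimiter so values containing "/" (like /dev/tty1) work
        sed "s|^#\{0,1\}${key}=.*|${key}=${value}|" "${conf_file}" > "${tmp_file}" &&
            cat "${tmp_file}" > "${conf_file}"
        rm -f "${tmp_file}"
    else
        # Key not present at all: append it to the end of the file
        printf '%s=%s\n' "${key}" "${value}" >> "${conf_file}"
    fi
}
```

Usage would look like set_conf_option /etc/systemd/journald.conf ForwardToConsole no, followed by a restart of systemd-journald so the change takes effect.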

Kernel messages

Disable printing kernel messages to the console: dmesg -D (https://superuser.com/a/793692/128124)

Kernel command line: loglevel=0 (https://stackoverflow.com/questions/16390004)

Syslog

Edit /etc/rsyslog.conf and remove lines that send things to the console; look for lines targeting /dev/console or /dev/sysmsg, for example.

https://serverfault.com/questions/392299/syslog-written-on-console