Monthly Archives: July 2010

How-To: Reinstall rpm and yum without rpm/yum in CentOS/RHEL

We were trying to reinstall the Munin package last week on one of our production servers and I somehow managed to screw things up. I accidentally removed rpm (and since yum depends on it, yum was removed too!). With yum and rpm both gone, we couldn’t install anything! Cr*p!

Desperate times indeed. Luckily, a Google search pointed me in the right direction. I stumbled on a site that gave me the answer I was looking for 🙂

NOTE: First check which version of CentOS you have and which architecture. The URLs must be changed if you’re not using CentOS 4.1
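
Since rpm itself is gone, the version has to come from somewhere other than the package database. Here is a minimal sketch that reads the release file and asks the kernel for the architecture. The function name show_release_and_arch is made up for illustration, and I’m assuming the release string lives in /etc/redhat-release, which is the usual place on CentOS/RHEL.

```shell
# Print the distribution release string and the machine architecture.
# The release file path can be overridden with the first argument.
show_release_and_arch() {
    release_file="${1:-/etc/redhat-release}"   # assumed default location
    if [ -r "$release_file" ]; then
        echo "release: $(cat "$release_file")"
    else
        echo "release: unknown ($release_file not readable)"
    fi
    # uname -m reports the hardware name, e.g. i686 or x86_64
    echo "arch: $(uname -m)"
}

# Usage: show_release_and_arch
```

Use the two values to pick the matching package directory on your mirror.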

These are the commands that I executed in the shell:
$ mkdir /tmp/install && cd /tmp/install
$ wget -c
$ rpm2cpio rpm-4.3.3-9_nonptl.i386.rpm | cpio -dim
$ find . -type d -exec chmod 755 {} \;
$ tar cf - ./usr ./etc ./bin | (cd /; tar xvf -)
$ rpm --rebuilddb
$ rpm -i rpm-4.3.3-9_nonptl.i386.rpm

In the last line (above), rpm has to be re-installed through rpm itself so that the package version is recorded in the CentOS RPM database. rpm won’t install a .rpm package otherwise. If you can’t find or run rpm2cpio, copy it from another existing CentOS/RHEL installation.
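
Before starting, it may help to confirm that the helper tools used in the steps above are actually still on the box. This is only a sketch. check_tools is a hypothetical helper name, and it relies on nothing more than the shell’s built-in command -v.

```shell
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent command and
# returns non-zero if at least one is absent.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return $rc
}

# Usage: check_tools wget rpm2cpio cpio tar
```

Anything reported as missing has to be copied over from another machine first.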

To re-install yum, I ran these commands:
$ wget -c
$ wget -c
$ rpm -i rpm-python-4.3.3-9_nonptl.i386.rpm
$ rpm -i yum-2.2.1-1.centos4.noarch.rpm

And that’s how I cleaned up my mess. And yes, the server is running CentOS 4.1 🙂

How-To: Fix service check time outs in Nagios + NRPE deployed in CentOS/RHEL 5

Once you get used to writing plug-ins for Nagios and the plug-ins you write grow in complexity, you may encounter this error: service check timed out.

If some of your service checks have this problem, you can narrow it down to these 3 values:

1. how slow is the plugin

    This is the first thing you should do. Check how much time your plugin needs before it finishes checking and provides an exit status. Log on to the server you’re monitoring and run the plugin locally. Use the time command to measure.
    $ time /usr/lib/nagios/plugins/check_service

2. how short is NRPE’s patience

    Once you have the value (in seconds) from step #1, check the NRPE configuration on that same server. The default location of NRPE’s configuration is /etc/nagios/nrpe.cfg
    Find the command_timeout parameter. Its value, in seconds, must be greater than the value you got in step #1.
    Once the parameter has been set, restart the NRPE service (service nrpe restart).

3. how short is Nagios’ patience

    Nagios executes a command, check_nrpe, to connect to an NRPE service. check_nrpe has a timeout parameter, -t. This parameter must have a bigger value than the one you set in step #2.
    Log on to your Nagios server and set this by opening the commands configuration file, /etc/nagios/objects/commands.cfg
    Find check_nrpe and edit its command_line to set the -t parameter. For instance, if you want the timeout value to be 500 seconds, it will look like this:
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t 500
    Restart the Nagios service afterwards (service nagios restart).
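
The three values have to form a chain: plugin run time < command_timeout < check_nrpe -t. Here is a rough sketch for checking that chain by hand. The function names are made up for illustration, and I’m assuming nrpe.cfg uses the plain command_timeout=N form.

```shell
# Read command_timeout (in seconds) from an NRPE config file.
nrpe_command_timeout() {
    sed -n 's/^[[:space:]]*command_timeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Run a command and print how many whole seconds it took.
# The plugin's own exit status is ignored on purpose.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}

# Verify: plugin seconds < NRPE command_timeout < check_nrpe -t
check_timeout_chain() {
    plugin_secs=$1; nrpe_secs=$2; nagios_secs=$3
    if [ "$plugin_secs" -ge "$nrpe_secs" ]; then
        echo "raise command_timeout above ${plugin_secs}s"
        return 1
    fi
    if [ "$nrpe_secs" -ge "$nagios_secs" ]; then
        echo "raise check_nrpe -t above ${nrpe_secs}s"
        return 1
    fi
    echo "ok"
}

# Usage on the monitored server (the -t value comes from commands.cfg):
#   check_timeout_chain "$(elapsed_seconds /usr/lib/nagios/plugins/check_service)" \
#       "$(nrpe_command_timeout /etc/nagios/nrpe.cfg)" 500
```

Whole seconds are coarse, so leave some headroom between each pair of values.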

In most cases these 3 steps should do 🙂

How-To: Thwart brute force SSH attacks in CentOS/RHEL 5

UPDATE: This was a good exercise, but I decided to replace the script with denyhosts. In CentOS, just install the EPEL repo first, then you can install it via yum.

This is one of the problems that my team encountered when we opened up a firewall for SSH connections. Brute force SSH attacks using botnets are just everywhere! And if you’re not careful, it’s quite a headache when one of your servers gets compromised.

Lots of tips can be found on the Internet, and this is the approach that I came up with based on the numerous sites that I’ve read.

  1. strong passwords
    DUH! This is obvious but most people ignore it. Don’t be lazy.
  2. disable root access through SSH
    Most of the time, direct root access is not needed. Disabling it is highly recommended.

    • open /etc/ssh/sshd_config
    • enable and set this SSH config to no: PermitRootLogin no
    • restart SSH: service sshd restart
  3. limit users who can log-in through SSH
    You can specify which users may use the SSH service. Botnets often try user names that were added by an application, so listing the allowed users reduces the exposure.

    • open /etc/ssh/sshd_config
    • enable and list the users with this SSH config: AllowUsers user1 user2 user3
    • restart SSH: service sshd restart
  4. use a script to automatically block malicious IPs
    Using the SSH daemon’s log file (in CentOS/RHEL, it’s /var/log/secure), a simple script can automatically block malicious IPs through TCP Wrappers’ hosts.deny
    If AllowUsers is enabled, the SSH daemon will log invalid attempts in this format:
    sshd[8207]: User apache from not allowed because not listed in AllowUsers
    sshd[15398]: User ftp from not allowed because not listed in AllowUsers

    SSH also logs invalid attempts in this format:
    sshd[6419]: Failed password for invalid user zabbix from port 50962 ssh2

    Based on the information above, I came up with this script:

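    #!/bin/bash
    file_log='/var/log/secure'
    file_host_deny='/etc/hosts.deny'
    tmp_list='/root/hosts.deny.tmp'       # placeholder path, adjust as needed
    # always exclude these IPs
    exclude_ips='^ALL: 127\.0\.0\.1$'     # placeholder pattern, list your own trusted IPs
    if [[ -e $tmp_list ]]
    then
        rm "$tmp_list"
    fi
    # set the separator to new lines only
    IFS=$'\n'
    # REGEX filter: today's date, then either of the two log formats
    filter="^$(date +%b\\s*%e).+(not listed in AllowUsers|Failed password.+invalid user)"
    for ip in $( pcregrep "$filter" "$file_log" \
      | perl -ne 'if (m/from\s+([^\s]+)\s+(not|port)/) { print $1,"\n"; }' )
    do
        if [[ $ip ]]
        then
            echo "ALL: $ip" >> "$tmp_list"
        fi
    done
    # reset
    unset IFS
    # merge with the existing entries, drop duplicates and excluded IPs
    cat "$file_host_deny" >> "$tmp_list"
    sort -u "$tmp_list" | pcregrep -v "$exclude_ips" > "$file_host_deny"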

    I deployed the script in root’s crontab and set it to run every minute 🙂
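
To see what such a script would pick up without touching hosts.deny, the extraction step can be tried on its own. The sketch below is a stand-in that uses grep and awk instead of pcregrep and perl, and it skips the date filter. suspect_ips is a hypothetical name, and the addresses in the usage example are documentation-range placeholders, not values from real logs.

```shell
# Read sshd log lines on stdin and print the unique source addresses of
# "not listed in AllowUsers" and "Failed password ... invalid user" entries.
suspect_ips() {
    grep -E 'not listed in AllowUsers|Failed password.+invalid user' \
      | awk '{
            # the address is the word between "from" and "not" or "port"
            for (i = 1; i < NF - 1; i++)
                if ($i == "from" && ($(i + 2) == "not" || $(i + 2) == "port"))
                    print $(i + 1)
        }' \
      | sort -u
}

# Usage (dry run, nothing is written):
#   suspect_ips < /var/log/secure
# A line such as
#   sshd[8207]: User apache from 192.0.2.10 not allowed because not listed in AllowUsers
# yields 192.0.2.10
```

Reviewing that list for your own addresses first is a cheap way to avoid locking yourself out.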

There you go. Of course, YMMV. Always test before you deploy, and I’m pretty sure there are a lot of other tools available 🙂
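
As a final sanity check for steps 2 and 3, here is a small sketch that reads sshd_config and reports whether PermitRootLogin and AllowUsers are set as described. The function names are hypothetical. It only looks at the first uncommented occurrence of each keyword, since sshd uses the first value it finds for most options.

```shell
# Print the value of the first uncommented occurrence of a keyword.
# $1 = config file, $2 = keyword (matched case-insensitively)
sshd_setting() {
    awk -v key="$2" '
        tolower($1) == tolower(key) {
            $1 = ""
            sub(/^[ \t]+/, "")
            print
            exit
        }' "$1"
}

# Report whether root login is disabled and AllowUsers is present.
check_sshd_hardening() {
    cfg="${1:-/etc/ssh/sshd_config}"
    rc=0
    if [ "$(sshd_setting "$cfg" PermitRootLogin)" != "no" ]; then
        echo "PermitRootLogin is not set to no"
        rc=1
    fi
    if [ -z "$(sshd_setting "$cfg" AllowUsers)" ]; then
        echo "AllowUsers is not set"
        rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        echo "ok"
    fi
    return $rc
}

# Usage: check_sshd_hardening /etc/ssh/sshd_config
```

Running sshd -t before the restart is also worthwhile, since it checks the configuration file for syntax errors without starting the daemon.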