May 16

Linux: Only two of four gluster servers are receiving write requests

I talked with some gluster admins about an issue I had where only two of the four servers were answering write requests. I had originally created my cluster with only two servers and two bricks. Even though I deleted the original settings, the old layout information was still stored in gluster’s DHT (Distributed Hash Table). We proved this theory by having me create a new directory and then write files into it; those files dispersed across all four servers as they were supposed to. The fix is to run a gluster volume rebalance on the affected volume, which did correct the problem.
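For reference, here is a minimal sketch of the fix, assuming the affected volume is named datavol1 (substitute your own volume name):
gluster volume rebalance datavol1 start
gluster volume rebalance datavol1 status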

Category: Linux | Comments Off on Linux: Only two of four gluster servers are receiving write requests
May 11

Linux: Gluster storage and replication options explained

One of the key issues I see with any storage clustering is the loss of available storage. If I have four servers and each server has a 1 TB drive, how I configure the cluster determines the amount of usable storage space. It comes down to how much redundancy a person wants versus how much storage space they need.

Replication Examples:
4 servers
1 TB drive each server

Distributed(default) glusterfs volume:
gluster volume create test-volume transport tcp fscluster1:/exp1 fscluster2:/exp2 fscluster3:/exp1 fscluster4:/exp2

Total storage available = 4TB
Complete files are stored on one of the four servers. If you lose a server, the files that were stored on that server are gone. All other files remain available on the servers that are still active.

Replicated gluster volume:
gluster volume create test-volume replica 4 transport tcp fscluster1:/exp1 fscluster2:/exp2 fscluster3:/exp1 fscluster4:/exp2

Total storage available = 1TB

When replicating across all of the servers, one loses a lot of available storage. In this case three quarters of the raw capacity is consumed by the extra copies, but I have incredible redundancy: I can lose up to three servers and still have all of my data.

Distributed/Replicated gluster volume:
gluster volume create test-volume replica 2 transport tcp fscluster1:/exp1 fscluster2:/exp2 fscluster3:/exp1 fscluster4:/exp2

Total storage available = 2TB

Group1
fscluster1 and fscluster2 = 1TB
Group2
fscluster3 and fscluster4 = 1TB

In this example there are two groups of servers. The servers in a group replicate files with each other. You can lose any single server in a group and your files are still completely available. When using round-robin DNS (RRDNS), files can end up stored on either group. From a gluster client’s perspective they appear to be on one hard drive. In reality any single file is on two hard drives, on the two servers in the same group. If a server in a group goes down, the remaining single server will write new files to the other group.
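To see this for yourself, here is a minimal, hedged sketch using the brick paths from the create command above (the client mount point /mnt/test-volume is just an assumption):
touch /mnt/test-volume/somefile.txt    (on the client)
ls /exp1    (on fscluster1 and fscluster3)
ls /exp2    (on fscluster2 and fscluster4)
The file should appear on fscluster1:/exp1 and fscluster2:/exp2, or on fscluster3:/exp1 and fscluster4:/exp2, but never split between the two groups.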

Stripes
The next two types are striped volumes. Neither offers any file redundancy, as files are striped across the servers’ drives. Striped volumes should theoretically be faster due to having more “arms” doing the work. I have not tested this yet.

Striped gluster volume:
gluster volume create test-volume stripe 4 transport tcp fscluster1:/exp1 fscluster2:/exp2 fscluster3:/exp1 fscluster4:/exp2
Total storage available = 4TB

A single file is spread across all four servers, so there is no duplication and the full capacity is usable. All four servers need to be running to access any file.

Distributed striped glusterfs volume:
gluster volume create test-volume stripe 2 transport tcp fscluster1:/exp1 fscluster2:/exp2 fscluster3:/exp1 fscluster4:/exp2
Total storage available = 4TB

Group1
fscluster1 and fscluster2 = 2TB
Group2
fscluster3 and fscluster4 = 2TB

There are two groups.  A single file will be stored in one of the two groups, but striped across both of the servers in that group.  Both servers in the group must be running to access a file stored there.  When using round-robin DNS (RRDNS), files can end up stored in either group. From a gluster client’s perspective they appear to be on one hard drive. In reality any single file is spread across two hard drives, on the two servers in the same group. If a server in a group goes down, the remaining single server will write new files to the other group.

Category: Linux | Comments Off on Linux: Gluster storage and replication options explained
May 10

Linux: Setting up a basic gluster storage cluster

Setup Centos server
Update the server
yum -y update

Assign IP

set the hostname
hostnamectl set-hostname fscluster1 --static

Disable SELINUX
nano /etc/sysconfig/selinux   (set SELINUX=disabled)

Disable the Firewall
systemctl status firewalld
systemctl disable firewalld
systemctl stop firewalld
iptables -F

Reboot

DNS
At this point DNS is important. Make certain your servers are in your local DNS server so they can talk to each other by short name and FQDN. Use RRDNS to allow the client to talk with all servers in the cluster.
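As a hedged illustration of the RRDNS part (the zone entries and addresses below are made up), the DNS zone simply gets one A record per server under a shared name:
fscluster    IN    A    192.168.10.11
fscluster    IN    A    192.168.10.12
fscluster    IN    A    192.168.10.13
fscluster    IN    A    192.168.10.14
A client that mounts fscluster.domainname.local will then be handed the four servers in rotation.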

Setup repos and get needed software
cd Downloads/
wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
wget http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-6.noarch.rpm
rpm -ivh epel-release-7-6.noarch.rpm
yum update -y

Install gluster
yum install glusterfs-server
service glusterd start
service glusterd status

Add glusterd to startup
systemctl enable glusterd

Create the directory structure for mounting the new drives (I am doing two new drives)
mkdir -p /data/brick1/gv0
mkdir -p /data/brick2/gv0

Setup File System
fdisk -l
fdisk /dev/sdb
n        (new partition)
p        (primary partition)
1        (partition number 1)
Enter    (accept default first sector)
Enter    (accept default last sector)
w        (write the partition table)
mkfs.btrfs /dev/sdb1
fdisk /dev/sdc
n        (new partition)
p        (primary partition)
1        (partition number 1)
Enter    (accept default first sector)
Enter    (accept default last sector)
w        (write the partition table)
mkfs.btrfs /dev/sdc1

Add new drives to fstab for automounting
echo "/dev/sdb1 /data/brick1 btrfs defaults 0 0" >> /etc/fstab
echo "/dev/sdc1 /data/brick2 btrfs defaults 0 0" >> /etc/fstab
mount -a
df -h

After doing this setup on at least one more server, gluster can be configured.
From fscluster1
gluster peer probe fscluster2.domainname.local
gluster peer status
gluster pool list

Create volume
gluster volume create datavol1 replica 2 transport tcp fscluster1.domainname.local:/data/brick1/gv0 fscluster2.domainname.local:/data/brick1/gv0 force
gluster volume start datavol1
gluster volume info

Create a glusterfs mount point to connect to on each server
mkdir -p /mnt/gv01
Add mount to fstab (be sure to specify the appropriate server name)
echo "fscluster1.domainname.local:datavol1 /mnt/gv01 glusterfs defaults 0 0" >> /etc/fstab
mount -a

Install glusterfs on a client (I am using Ubuntu 16.04)
sudo apt-get install glusterfs-client

Create a mount point on your client
mkdir /mnt/fscluster

mount the gluster file system
sudo mount -t glusterfs fscluster.domainname.local:/datavol1 /mnt/fscluster

Other information:

Possible Connection Issues
gluster peer status – Will show what servers are part of the cluster and their connection status
gluster volume info – Will show what servers and bricks are connected in the cluster.  It will also display the clusters brick configuration.
A four server Distributed/Replicated Example:
____________________________________
Volume Name: volume1
Type: Distributed-Replicate
Volume ID: 6689132a-6221-4fee-ba6d-892b7d0fc7f5
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fscluster1.domainname.local:/data/brick1/gv0
Brick2: fscluster2.domainname.local:/data/brick1/gv0
Brick3: fscluster3.domainname.local:/data/brick1/gv0
Brick4: fscluster4.domainname.local:/data/brick1/gv0
Options Reconfigured:
performance.readdir-ahead: on
____________________________________
netstat -apt | grep glusterfsd
The output of this command should show each of your connected servers.  If not, restart glusterfsd on each server that is not showing up and try the command again.

To delete a data volume
gluster volume stop datavol1
gluster volume delete datavol1

If you have problems creating a volume, sometimes it is due to a connection issue. Troubleshoot with telnet.
yum install telnet
telnet (name or ip of server) 24007

If a data volume is not created right and remnants remain, you can use the following commands to clean things up.
(Possible error when recreating a volume: Staging failed on fscluster2.domainname.local. Error: /data/brick1/gv0 is already part of a volume)
Do the following:
setfattr -x trusted.glusterfs.volume-id /$pathtobrick
setfattr -x trusted.gfid /$pathtobrick
rm -rf /$pathtobrick/.glusterfs
Example:
________________________________________
setfattr -x trusted.glusterfs.volume-id /data/brick1/gv0
setfattr -x trusted.gfid /data/brick1/gv0
rm -rf /data/brick1/.glusterfs
________________________________________

By: nighthawk

Category: Linux | Comments Off on Linux: Setting up a basic gluster storage cluster
April 21

Linux: Local domain addresses are not resolving in Ubuntu

In Ubuntu 14.04, local domain addresses are not resolving properly even though the nslookup and dig commands work.  Those tools query the DNS server directly rather than going through the system resolver, which is why they still succeed.  To fix the issue you need to modify /etc/nsswitch.conf

Replace:
hosts:          files mdns4_minimal [NOTFOUND=return] dns mdns4

with:
hosts: files dns
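
A hedged way to verify the fix (the hostname is only an example): nslookup queries the DNS server directly, while getent goes through nsswitch.conf the same way ordinary applications do, so it should only start returning your local names after the change above.
nslookup fscluster1.domainname.local
getent hosts fscluster1.domainname.local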

By doep, dragouf, and nighthawk

Category: Linux | Comments Off on Linux: Local domain addresses are not resolving in Ubuntu
April 13

Linux: How to Automate Docker Deployments

How to Automate Docker Deployments

11th November 2014


For months I’ve been using Drone to continuously build and deploy Docker applications. This has achieved the desired effect of drastically reducing the time required to get code into production. But it wasn’t until recently that I finally got a handle on my deploy script, which was silently leaking old, untagged Docker images onto my hard drives. Below, I share a method for cleanly upgrading a running container, in the form of a bash script suitable for basic automated deployment setups.

Docker Images vs. Containers

First, let’s cover some important points about Docker that will help explain the script below. In Dockerland, there are images and there are containers. The two are closely related, but distinct. For me, grasping this dichotomy has clarified Docker immensely.

What’s an Image?

An image is an inert, immutable file that’s essentially a snapshot of a container. Images are created with the build command, and they’ll produce a container when started with run. Images are stored in a Docker registry such as registry.hub.docker.com. Because they can become quite large, images are designed to be composed of layers of other images, allowing a minimal amount of data to be sent when transferring images over the network.
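
If you want to see the layers an image is built from, docker history will list them (the image name here is just an example):
docker history ubuntu:14.04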

Local images can be listed by running docker images:

REPOSITORY                TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ubuntu                    13.10               5e019ab7bf6d        2 months ago        180 MB
ubuntu                    14.04               99ec81b80c55        2 months ago        266 MB
ubuntu                    latest              99ec81b80c55        2 months ago        266 MB
ubuntu                    trusty              99ec81b80c55        2 months ago        266 MB
<none>                    <none>              4ab0d9120985        3 months ago        486.5 MB
...

Some things to note:

  1. IMAGE ID is the first 12 characters of the true identifier for an image. You can create many tags of a given image, but their IDs will all be the same (as above).
  2. VIRTUAL SIZE is virtual because it’s adding up the sizes of all the distinct underlying layers. This means that the sum of all the values in that column is probably much larger than the disk space used by all of those images.
  3. The value in the REPOSITORY column comes from the -t flag of the docker build command, or from docker tag-ing an existing image. You’re free to tag images using a nomenclature that makes sense to you, but know that docker will use the tag as the registry location in a docker push or docker pull.
  4. The full form of a tag is [REGISTRYHOST/][USERNAME/]NAME[:TAG]. For ubuntu above, REGISTRYHOST is inferred to be registry.hub.docker.com. So if you plan on storing your image called my-application in a registry at docker.example.com, you should tag that image docker.example.com/my-application (see the example just after this list).
  5. The TAG column is just the [:TAG] part of the full tag. This is unfortunate terminology.
  6. The latest tag is not magical, it’s simply the default tag when you don’t specify a tag.
  7. You can have untagged images only identifiable by their IMAGE IDs. These will get the <none> TAG and REPOSITORY. It’s easy to forget about them.
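
As a hedged example of points 3 and 4 (the registry host and image name are taken from the article’s own illustration), building and pushing to a private registry looks like this:
docker build -t docker.example.com/my-application:1.0 .
docker push docker.example.com/my-application:1.0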

More info on images is available from the Docker docs.

What’s a container?

To use a programming metaphor, if an image is a class, then a container is an instance of a class—a runtime object. Containers are hopefully why you’re using Docker; they’re lightweight and portable encapsulations of an environment in which to run applications.

View local running containers with docker ps:

CONTAINER ID        IMAGE                               COMMAND                CREATED             STATUS              PORTS                    NAMES
f2ff1af05450        samalba/docker-registry:latest      /bin/sh -c 'exec doc   4 months ago        Up 12 weeks         0.0.0.0:5000->5000/tcp   docker-registry  

Here I’m running a dockerized version of the docker registry, so that I have a private place to store my images. Again, some things to note:

  1. Like IMAGE ID, CONTAINER ID is the true identifier for the container. It has the same form, but it identifies a different kind of object.
  2. docker ps only outputs running containers. You can view stopped containers with docker ps -a.
  3. NAMES can be used to identify a started container via the --name flag.

How to avoid image and container buildup?

One of my early frustrations with Docker was the seemingly constant buildup of untagged images and stopped containers. On a handful of occasions this buildup resulted in maxed-out hard drives slowing down my laptop or halting my automated build pipeline. Talk about “containers everywhere”!

We can remove all untagged images by combining docker rmi with the recent dangling=true query:

docker images -q --filter "dangling=true" | xargs docker rmi  

Docker won’t be able to remove images that are behind existing containers, so you may have to remove stopped containers with docker rm first:

docker rm `docker ps --no-trunc -aq`  

These are known pain points with Docker, and may be addressed in future releases. However, with a clear understanding of images and containers, these situations can be avoided with a couple of practices:

  1. Always remove a useless, stopped container with docker rm [CONTAINER_ID].
  2. Always remove the image behind a useless, stopped container with docker rmi [IMAGE_ID].

These are the practices built into the upgrade script below.

Deployment Script

The following script is what I use to upgrade a running Docker container to a newer version. This could go in the deploy part of a drone.yml:

#!/bin/bash
docker pull docker.example.com/my-application:latest
docker stop my-application
docker rm my-application
docker rmi docker.example.com/my-application:current
docker tag docker.example.com/my-application:latest docker.example.com/my-application:current
docker run -d --name my-application docker.example.com/my-application:latest  

Let’s step through line by line.

1. Pull latest image

docker pull docker.example.com/my-application:latest  

We assume that a more recent image has been built and pushed to a registry. If the image hasn’t been tagged, this will result in two new images in the docker images list: one with <none> tag, and one with latest. They’ll both share the same IMAGE ID. If there’s a previous version of this image tagged with latest, it will be untagged.

2. Stop the running container

docker stop my-application  

Stop the running instance of my-application. This assumes that we gave the container that name the last time it was run.

3. Remove stopped container

docker rm my-application  

Now that the container is stopped, it’s safe to remove it. This is cleanup step #1. Also, without this step, we wouldn’t be able to give the name my-application to a different container.

4. Remove image behind stopped container

docker rmi docker.example.com/my-application:current  

Now Docker will let us remove the image behind the container we just stopped and removed. This assumes we’ve previously tagged it with current. We need a tag other than latest to be able to differentiate between the version of the app we’re getting rid of, and the one we’re bringing in.

5. Tag the newly downloaded image

docker tag docker.example.com/my-application:latest docker.example.com/my-application:current  

Now that the current tag is nonexistent, we tag the downloaded image with current, so that we can identify it next time around. Until step 1 of our next upgrade, the current and latest tags will have the same IMAGE ID.

6. Run the new container

docker run -d --name my-application docker.example.com/my-application:latest  

Run the new image, being sure to --name the resulting container my-application.


That’s it! A possible improvement would be to keep the previous version around to support a quick rollback script. I’m interested in hearing how others are doing automated Docker deployments, and wonder how tools like Kubernetes implement container upgrades under the hood.
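
As a hedged sketch of that rollback idea (the tag names are purely illustrative): during the upgrade, step 4 would tag the outgoing image as previous instead of removing it, and a separate rollback script could then swap the container back:

#!/bin/bash
# assumes the upgrade ran: docker tag docker.example.com/my-application:current docker.example.com/my-application:previous
docker stop my-application
docker rm my-application
docker run -d --name my-application docker.example.com/my-application:previous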

By: Caleb Sotelo

Category: Linux | Comments Off on Linux: How to Automate Docker Deployments