Friday, May 16, 2014

Faking Out Ceph-Deploy in OpenStack

I wanted to build a functional Ceph deployment for testing but did not have hardware to use, so I decided to use my instances in OpenStack. The image I chose for this configuration was the stock RHEL 6.5 cloud image from Red Hat. However, when I went to run a ceph-deploy install against my monitor server, I ran into this:

[root@ceph-mon ceph]# ceph-deploy install ceph-mon
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.2): /usr/bin/ceph-deploy install ceph-mon
[ceph_deploy.install][DEBUG ] Installing stable version firefly on cluster ceph hosts ceph-mon
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-mon ...
[ceph-mon][DEBUG ] connected to host: ceph-mon
[ceph-mon][DEBUG ] detect platform information from remote host
[ceph_deploy][ERROR ] UnsupportedPlatform: Platform is not supported:

It didn't actually say which platform it thought it had detected, but I knew that Red Hat 6.5 was supported, so the error did not make any sense. What I discovered was that the following file was missing from my cloud image:

/etc/redhat-release

So I manually added it, since this is evidently the file the platform detection keys off of:

vi /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
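
An equivalent one-liner, assuming you are root on the instance, is:

echo "Red Hat Enterprise Linux Server release 6.5 (Santiago)" > /etc/redhat-release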

Then, when I reran ceph-deploy, it detected a supported platform:
 
[root@ceph-mon ceph]# ceph-deploy install ceph-mon
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.2): /usr/bin/ceph-deploy install ceph-mon
[ceph_deploy.install][DEBUG ] Installing stable version firefly on cluster ceph hosts ceph-mon
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-mon ...
[ceph-mon][DEBUG ] connected to host: ceph-mon
[ceph-mon][DEBUG ] detect platform information from remote host
[ceph-mon][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: Red Hat Enterprise Linux Server 6.5 Santiago
[ceph-mon][INFO  ] installing ceph on ceph-mon
[ceph-mon][INFO  ] Running command: yum clean all


 

Cleaning Up Expired Tokens in OpenStack Keystone

Keystone is an OpenStack project that provides Identity, Token, Catalog, and Policy services specifically for the projects in the OpenStack family.  When a client obtains a token from Keystone, that token has a validity period before it expires.  However, even after a token is marked expired, it is kept in Keystone's MySQL database.  If your environment is handing out a lot of tokens, this causes the token table to grow without bound.
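
If you want to see how large the table has gotten, a quick count against the keystone database (assuming the default SQL token backend) will show it:

mysql> use keystone;
mysql> select count(*) from token;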

To prevent this unbounded growth, you can run the following command from cron to clean up the expired tokens in the MySQL DB:

keystone-manage token-flush
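
For example, an /etc/cron.d entry along these lines (the hourly schedule, keystone user, and log path are just illustrative choices for your environment) will keep the table trimmed:

0 * * * * keystone /usr/bin/keystone-manage token-flush >> /var/log/keystone/token-flush.log 2>&1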

Thursday, May 15, 2014

OpenStack Cinder: VNX Volume Cleanup

I recently had an issue where multiple Cinder volumes in OpenStack (Havana) were stuck in a Deleting state.   Unfortunately, the usual trick of resetting them to Available and attempting the delete again did not work.    However, I was able to come up with a procedure that got the job done and restored consistency.

In my example, the Cinder volumes were being provisioned on an EMC VNX, so the first step was to verify whether the volumes themselves had been removed from the VNX.

Cleanup on VNX:

1) Obtain the volume ID from the OpenStack Dashboard and/or CLI.
2) Log into Unisphere on the VNX that contains the volume pool where the volumes/LUNs for Cinder are being provisioned.
3) Select the volume pool and show the LUNs associated with it.
4) Filter the LUNs using the volume ID obtained in step 1.
5) Delete the LUN.

Now that we have removed the reference on the VNX, we can continue with the cleanup on the OpenStack side within the database itself.  This involves editing three tables in the cinder MySQL database.

1) Obtain the volume ID from the OpenStack Dashboard and/or CLI.   Make note of the volume size as well.   You will also need the project/tenant ID that the volume belongs to.
2) Login to the OpenStack management controller that runs the MySQL DB.
3) Run the mysql command to access MySQL.  Note that your deployment may require a password and hostname.
4) Select the cinder database with the following syntax:

mysql> use cinder;

5) Next check if the volume id resides in the volume_admin_metadata table:

mysql> select * from volume_admin_metadata where volume_id="";

6) Delete the volume id if it does:

 mysql> delete from volume_admin_metadata where volume_id="";

7) Next check if the volume id resides in the volumes table:

 mysql> select * from volumes where id="";

8) Delete the volume id if it does:

mysql> delete from volumes where id="";

9) Next, update the quota_usages table and reduce the usage values for that project.  First get a listing to see where things stand:

mysql> select * from quota_usages where project_id="";

10) Then execute the update.  Depending on your setup, you will have to update multiple fields from the output in step 9.  In the example below, since I was clearing out all volumes for a given project/tenant, I was able to get away with the following update:

mysql> update quota_usages set in_use='0' where project_id="";

However, in cases where you are removing just one volume, you will need to specify both the project_id and the resource type in the WHERE clause so that you match the right in_use row.  Your new in_use value will be either the current number of GBs minus the removed volume's GBs, or the current number of volumes minus one.
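
As an illustration, if you were removing a single 10 GB volume (the resource names below should match the rows returned in step 9; the numbers here are hypothetical), the updates would look something like:

mysql> update quota_usages set in_use=in_use-1 where project_id="" and resource="volumes";
mysql> update quota_usages set in_use=in_use-10 where project_id="" and resource="gigabytes";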

Once I completed this, my system was back in sync and the volumes stuck in Deleting status were gone.
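
As a final sanity check, listing volumes across tenants as an admin (or simply rechecking the Dashboard) confirms the stuck entries are gone:

cinder list --all-tenants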

Sunday, May 11, 2014

OpenStack Havana - Recovering Services Account When Deleted

I was working with one of my colleagues who had accidentally deleted the services account in OpenStack.   Unfortunately, if this happens it tends to break your setup in a really big way.   Opening a case with Red Hat, whose OpenStack distribution we were using, led to no results, so I reverse engineered where the services account was referenced and reestablished it.  Here are the steps I took:

Symptoms:

1) In the web GUI the user gets "Oops something went wrong!" when trying to log in.   The user can get a valid token at the command line (keystone token-get), but authorization fails.
2) openstack-status shows the following:

                == Glance images ==
                Request returned failure status.
                Invalid OpenStack Identity credentials.
                == Nova managed services ==
                ERROR: Unauthorized (HTTP 401)
                == Nova networks ==
                ERROR: Unauthorized (HTTP 401)
                == Nova instance flavors ==
                ERROR: Unauthorized (HTTP 401)
                == Nova instances ==
                ERROR: Unauthorized (HTTP 401)

Resolution:

Create New Services Project:

Create new "services" project/tenant via CLI (keystone tenant-create).
Obtain new "services" project/tenant ID via CLI (keystone tenant-list).

Determine OLD_SERVICES_ID:

Determine the old project/tenant ID of the services project by looking at the default_project_id of the following users (nova, glance, neutron, heat, cinder) in the user table of the keystone database.   Their default_project_id values should all be the same and correspond to the ID of the previous services project that was removed.
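
Assuming the keystone schema described above, a query along these lines will surface that old ID:

mysql> use keystone;
mysql> select name, default_project_id from user where name in ('nova','glance','neutron','heat','cinder');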


Edit MySQL Database:

use keystone;
update user set default_project_id="NEW_SERVICES_ID" where default_project_id="OLD_SERVICES_ID";
use ovs_neutron;
update networks set tenant_id="NEW_SERVICES_ID" where tenant_id="OLD_SERVICES_ID";
update subnets set tenant_id="NEW_SERVICES_ID" where tenant_id="OLD_SERVICES_ID";
update securitygroups set tenant_id="NEW_SERVICES_ID" where tenant_id="OLD_SERVICES_ID";