Mimic
An API Compatible Mock Service For OpenStack
Lekha
software developer in test
(at) Rackspace
github: lekhajee
freenode irc: lekha
twitter: @lekha_j1
Glyph
software developer
(at) Rackspace
github: glyph
freenode irc: glyph
twitter: @glyph
What?
Who?
Sneak Peek
Why?
Rackspace Auto Scale
An open source project
Source: https://github.com/rackerlabs/otter
Dependencies
(of Auto Scale)
Rackspace
Identity
Rackspace
Cloud Servers
Rackspace
Cloud Load Balancers
Rackspace Identity
is API compatible with
OpenStack Identity v2
Rackspace Cloud Servers
is powered by
OpenStack Compute
Rackspace Cloud Load Balancers
is a
Custom API
Testing
(for Auto Scale)
Lekha: Because Auto Scale interacted with so many other systems,
testing Auto Scale did not mean just testing the features of
Auto Scale. It also meant verifying that, if any of the systems it
depended on did not behave as expected, Auto Scale did not crumble
and crash, but stayed consistent and handled those upstream
failures gracefully.
So, there were two kinds of tests for Auto Scale:
Functional
API contracts
Lekha: One was the functional tests that validate the API
contracts. These tests verified the responses of the Auto Scale
API calls given valid or malformed requests.
System Integration
             ↱ Identity
Auto Scale   → Compute
             ↳ Load Balancers
Lekha: And the other was the system integration tests. These were
more complex. They verified the integration between Auto Scale and
Identity, Compute, and Load Balancers.
System Integration
Success
Failure
Lekha: For example: when a scaling group was created, one such
test would verify that the servers on that group were provisioned
successfully. That is, that the servers went into an 'active'
state and were then added as nodes to the load balancer on the
scaling group. (DOUBLE CLICK)
Or, if a server went into an error state (yes, that can
happen!), that Auto Scale was able to re-provision that server
successfully, and then add that active server to the load balancer
on the scaling group.
Testing Problems
Test run time ≥ server build time
Lekha: All these tests were set up to run against the real
services. And... here are some observations I had whilst writing
the tests: (CLICK)
Servers could take over a minute, or ten minutes, or longer to
provision, and the tests would run that much longer.
BUILD → ACTIVE ERROR ACTIVE
Lekha: Sometimes the tests would fail due to random upstream failures.
For example, a test would expect a building server to go into an 'active'
state, but it would (CLICK) go into an ERROR state instead.
unknown errors
Lekha: And such negative scenarios, like actually testing how
Auto Scale would behave if a server did go into an 'error' state,
could not be tested at all, because they could not be reproduced
consistently.
However...
Improving test coverage
Tests → gate
Lekha: However, (CLICK) the overall test coverage was improving, and I
continued to add tests, oblivious to how long it was taking to run
the entire test suite!
Later, (CLICK) we started using these tests as a gate in the
Auto Scale merge pipeline.
And...
Slow, flaky tests
Unhappy peers
Lekha: And, (CLICK) the tests were running for so long and were sometimes flaky.
Nobody dared to run these tests locally! Not even me, when I was
adding more tests! (CLICK)
Also, our peers from the Compute and Load Balancers teams, whose
resources we were using up for our Auto Scale testing, were
not happy! So much so that we were pretty glad we were in a
remote office!
We've Had Enough!
(on Auto Scale)
Lekha: But we had had enough! This had to change! We needed
something to save us from these slow, flaky tests!
There And Back Again
Specific → General
Auto Scale → Mimic

General → Specific
Mocking Failure → Mimic Mimicking OpenStack
Glyph: Now that we've had enough, how are we going to solve this
problem? (CLICK) Since we've been proceeding from the specific
case of Auto Scale to the general utility of Mimic, (click) let's
go back to the general problem of testing for failure, and proceed
to the specific benefits that Mimic provides.
General →
Negative Path Testing
Glyph: Whenever you have code that handles failures, (CLICK) you
need to have tests to ensure that that code works properly.
Real Services
FAIL
Glyph: And if you have code that talks to external services, those
services are going to fail, and you're going to need to write code
to handle that.
But Not When You
WANT THEM TO
Glyph: But if your only integration tests are against real versions
of those external services, then only your unit tests are going to
give you any idea of whether you have handled those failure cases
correctly.
Succeeding
At Success
Glyph: Your positive-path code - the code that submits a request
and gets the response that it expects - is going to get lots of
testing in the real world. Services usually work, and when they
don't, the whole business of service providers is to fix it so they
do. So most likely, the positive-path code is going to get
exercised all the time and you will have plenty of opportunities to
flush out bugs.
Means Failing
At Failure
Glyph: If you test against real services, your negative-path code
will only get invoked in production when there's a real error. If
everything is going as planned, this should be infrequent, which is
great for your real service but terrible for your test coverage.
Mimic Succeeds
At Failure!
Glyph: It's really important to get negative-path code right. If
all the external services you rely on are working fine,
then it's probably okay if your code has a couple of bugs.
You might be able to manually work around them.
😈 ☁
(Production)
Glyph: But if things are starting to fail with some regularity in
your cloud - that is to say - if you are using a cloud - that is
exactly the time you want to make sure your system is
behaving correctly: accurately reporting the errors, measuring the
statistics on those errors, and allowing you to stay on top of
incident management for your service and your cloud.
😇 ☁
(Staging)
Glyph: Even worse, when you test against a real service, you are
probably testing against a staging instance. And, if your staging
instance is typical, it probably doesn't have as much hardware, or
as many concurrent users, as your production environment. Every
additional piece of hardware or concurrent user is another
opportunity for failure, so that means your staging environment is
even less likely to fail.
import unittest
Glyph: I remember the bad old days of the 1990s when most projects
didn't have any unit tests. Things are better than that now.
OpenStack itself has great test coverage. We have unit tests for
individual components and integration tests for testing real
components together.
test_stuff ... [OK]
Glyph: We all know that when you have code like this:
try:
    result = service_request()
except:
    return error
else:
    return ok(result)
Glyph: ... that we need to write tests for this part:
try:
    result = service_request()
except:
    return error
else:
    return ok(result)
Glyph: ... and one popular way to get test coverage for those error
lines is by writing a custom mock for it in your unit tests.
Glyph: So if we can't trust real systems for error conditions, why
isn't it sufficient to simply trust your unit tests to cover error
conditions, and have your integration tests for making sure that
things work in a more realistic scenario?
For those of you who don't recognize it, this is the Mock Turtle
from Alice in Wonderland. As you can see, he's not quite the
same as a real turtle, just like your test mocks aren't quite the
same as a real system.
if not os.chdir(ca_folder(project_id)):
    raise exception.ProjectNotFound(
        project_id=project_id)
Glyph: In June of this year, OpenStack
Compute introduced
a bug making it impossible to revoke a certificate. The
lines of code at fault were these two additions here.
This is not a criticism of Nova itself;
the
bug has already been fixed. My point is that they fell into
a very common trap.
if not os.chdir(ca_folder(project_id)):
    raise exception.ProjectNotFound(
        project_id=project_id)
Glyph: The bug here is that chdir
does not actually
return a value.
@mock.patch.object(os, 'chdir', return_value=True)
def test_revoke_cert_process_execution_error(self):
    "..."

@mock.patch.object(os, 'chdir', return_value=False)
def test_revoke_cert_project_not_found_chdir_fails(self):
    "..."
Glyph: Because the unit tests introduced with that change construct
their own mocks for chdir, Nova's unit tests properly cover all the
code, but the code is not integrated with a system that is verified
in any way against what the real system (in this case, Python's
os.chdir) actually does.
Glyph: In this specific case, Nova might have simply
tested against a real directory structure in the file system,
because relative to the value of testing against a real
implementation, "New Folder" is not a terribly expensive operation.
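As an illustration of that point, here is a minimal sketch (not Nova's actual test) of exercising the real os.chdir against a real temporary directory, which immediately reveals that it returns None rather than a boolean:

# A hedged sketch, not Nova's actual test: calling the real os.chdir
# against a real temporary directory shows that it returns None, so the
# "if not os.chdir(...)" check above could never behave as intended.
import os
import tempfile
import unittest

class ChdirBehavior(unittest.TestCase):
    def test_chdir_returns_none(self):
        original = os.getcwd()
        with tempfile.TemporaryDirectory() as directory:
            try:
                self.assertIsNone(os.chdir(directory))
            finally:
                os.chdir(original)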
Glyph: However, standing up an OpenStack cloud is significantly
more work than running mkdir
. If you are developing
an application against OpenStack, deploying a real cloud to test
against can be expensive, error-prone, and slow, as Auto Scale's
experience shows.
The Best Of
Both Worlds?
Glyph: Creating a one-off mock for every test is quick, but error-prone.
Good mocks rapidly become a significant maintenance burden in
their own right. Auto Scale needed something that could produce
all possible behaviors like a unit-test mock, but ensure those
behaviors accurately reflected a production environment. (CLICK) It
should be something maintained as a separate project, not part of a
test suite, so that it can have its own tests and code review to
ensure its behavior is accurate.
→Specific
Mimic
Glyph: Since we've been proceeding from the general to the
specific, (CLICK) right here, where we need a realistic mock of a
back-end openstack service, is where the specific value of Mimic
comes in.
Mimic
Version 0.0
Lekha: The first version of Mimic was built as a stand-in service
for Identity, Compute, and Rackspace Cloud Load Balancers, the
services that Auto Scale depends on.
Pretending
...
Lekha: The essence of Mimic is pretending. The first thing that you
must do to interact with it is to...
Pretending
to authenticate
Lekha: ...pretend to authenticate.
Mimic does not validate credentials - all authentications will
succeed. As with the real Identity endpoint, Mimic's identity
endpoint has a service catalog which includes endpoints for all the
services implemented within Mimic.
A well behaved OpenStack client will use the service catalog to
look up URLs for its service endpoints. Such a client will only
need two pieces of configuration to begin communicating with the
cloud, i.e. credentials and the identity endpoint. A client
written this way will only need to change the Identity endpoint to
be that of Mimic.
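To make that concrete, here is a minimal sketch, assuming a Mimic instance on its default port of 8900 and using the python-requests library, of authenticating and pulling the compute endpoint out of the service catalog (the credentials are placeholders, since Mimic accepts anything):

# A minimal sketch: authenticate against a locally running Mimic.
# Any credentials are accepted, since Mimic only pretends to authenticate.
import requests

MIMIC_IDENTITY = "http://localhost:8900/identity/v2.0/tokens"

response = requests.post(MIMIC_IDENTITY, json={
    "auth": {
        "passwordCredentials": {"username": "username",
                                "password": "password"},
        "tenantName": "11111",
    },
}).json()

token = response["access"]["token"]["id"]
catalog = response["access"]["serviceCatalog"]

# A well-behaved client looks its endpoints up in the service catalog
# rather than hard-coding URLs.
compute_url = next(entry for entry in catalog
                   if entry["type"] == "compute")["endpoints"][0]["publicURL"]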
Pretending
to Boot Servers
Lekha: When you ask Mimic to create a server, it pretends to create
one. This is not like stubbing with static responses: when Mimic
pretends to build a server, it remembers the information about that
server and will tell you about it in the subsequent requests.
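Continuing the sketch above (the token and compute_url are assumed to come from the authentication example; the image and flavor values are placeholders), pretending to boot a server and then listing servers shows that Mimic remembered it:

# An illustrative sketch, not a definitive client: ask Mimic to pretend
# to create a server, then list servers to see that it was remembered.
import requests

def boot_and_list(compute_url, token):
    # Mimic may not require the token, but a real client would send it.
    headers = {"X-Auth-Token": token}
    requests.post(compute_url + "/servers", headers=headers, json={
        "server": {"name": "pretend-server",
                   "imageRef": "placeholder-image",
                   "flavorRef": "placeholder-flavor"},
    })
    # The pretend server shows up in subsequent listings.
    return requests.get(compute_url + "/servers", headers=headers).json()

# e.g. print(boot_and_list(compute_url, token))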
Pretending
is faster
Lekha: Mimic was originally created to speed things up. So it was
very important that it be fast, both to respond to requests and for
developers to set up.
in-memory
Lekha: It uses in-memory data structures.
minimal
dependencies
(almost entirely pure Python)
Lekha: with minimal software dependencies, almost entirely pure Python.
Service Dependencies
Lekha: With no service dependencies
Configuration
Lekha: and no configuration
self-contained
Lekha: And it is entirely self-contained.
Demo!
Nova command-line client
Lekha: Let's see how we can run the Python nova command-line client against Mimic.
config.sh
export OS_USERNAME=username
export OS_PASSWORD=password
export OS_TENANT_NAME=11111
export OS_AUTH_URL=http://localhost:8900/identity/v2.0/tokens
Lekha: Here is the config file that holds
the environment variables required for the OpenStack
command-line clients.
config.sh
export OS_USERNAME=username
export OS_PASSWORD=password
export OS_TENANT_NAME=11111
export OS_AUTH_URL=http://localhost:8900/identity/v2.0/tokens
Lekha: We have set a random username, password
and tenant name, as Mimic only pretends to authenticate
config.sh
export OS_USERNAME=username
export OS_PASSWORD=password
export OS_TENANT_NAME=11111
export OS_AUTH_URL=http://localhost:8900/identity/v2.0/tokens
Lekha: And the auth URL is set to be that of Mimic's identity endpoint.
Now, let's continue where we left off with our first demo. So we
already have an instance of Mimic running.
Using Mimic
(on Auto Scale)
Lekha: We did the same thing with Auto Scale. We pointed the tests
and the Auto Scale API at an instance of Mimic.
The Results!
(Functional tests using Mimic)
Lekha: This reduced the test time dramatically!
Before Mimic, the functional tests would take...
Functional Tests:
15 minutes
against a real system
vs.
30 seconds
against Mimic
Lekha: (CLICK) 15 minutes to complete, and now they run in (CLICK)less than 30
seconds!
The Results!
(Integration tests using Mimic)
Lekha: In the system integration tests, if one of the servers in
the test remained in "building" for fifteen minutes longer than
usual, then the tests would run fifteen minutes slower.
Integration Tests:
3 hours or more
against a real system
vs.
3 minutes
against Mimic
Lekha: These tests took (CLICK) over 3 hours to complete, and using
Mimic this went down to (CLICK) less than 3 *minutes*,
consistently!
✈
Lekha: All our dev VMs are now configured to run against
Mimic.
One of our devs from the Rackspace Cloud Intelligence
team calls this "Developing on Airplane Mode!", as we can work
offline without having to worry about uptimes of the upstream
systems, and get immediate feedback on the code being written.
What about
negative paths?
Glyph: But Lekha, what about all the negative-path testing stuff I
was talking about before? Does Mimic simulate errors? How did
this dev VM test Auto Scale's error conditions?
Mimic does
simulate errors
Lekha: Well Glyph, I am as pleased as I am surprised that you ask
that. Mimic does simulate errors.
Error injection using metadata
Lekha: So, we had the one active server. Now, let's create a
server with the metadata `"server_building": 30`. This will
keep the server in the build state for 30 seconds. Now we have 2
servers: the active one and the building one. We can also create a
server that goes into an error state, using the metadata
`"server_error": True`. As you can see, we now have 3 different
servers, in 3 different states.
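The request bodies for those two cases might look roughly like this. This is a sketch following the metadata keys just described; note that a real Nova-compatible API expects string metadata values, so the exact value types Mimic accepts are worth checking against its documentation.

# Sketch of create-server bodies with Mimic's error-injection metadata,
# POSTed to the compute "/servers" endpoint like any other create request.
# Image and flavor references are placeholders.
server_that_builds_for_30_seconds = {
    "server": {"name": "slow-build",
               "imageRef": "placeholder-image",
               "flavorRef": "placeholder-flavor",
               "metadata": {"server_building": 30}},
}

server_that_goes_into_error = {
    "server": {"name": "doomed-server",
               "imageRef": "placeholder-image",
               "flavorRef": "placeholder-flavor",
               "metadata": {"server_error": True}},
}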
Retry On Errors
Lekha: For the purposes of Auto Scale it was important that we
have the right number of servers on a scaling group, even if a
number of attempts to create one failed. We chose to use metadata
for error injection so that requests with injected errors could
also be run against real services. For Auto Scale, the expected end
result is the same number of servers created, irrespective of the
number of failures. But this behavior may also be useful to many
other applications, because retrying is a common pattern for
handling errors (sketched below).
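The retry pattern itself is simple; a generic sketch, with a hypothetical create_server callable rather than Auto Scale's actual code, looks something like this:

# A generic sketch of retry-on-failure: keep attempting server creation
# so that transient failures still end with a server being created.
# `create_server` is a hypothetical callable, not Auto Scale's real code.
def create_with_retries(create_server, attempts=3):
    for attempt in range(attempts):
        try:
            return create_server()
        except Exception:
            if attempt == attempts - 1:
                raise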
Mimic 0.0 was...
Too Limited
Lekha: (PAUSE) However, the first implementation of Mimic had some flaws.
It was fairly Rackspace-specific and only implemented the endpoints
of the services that Auto Scale depends upon, and they were all
implemented as part of Mimic's core. It ran each service on a
different port, meaning that for N endpoints you would need not
just N port numbers, but N *consecutive* port numbers. It allowed
for testing error scenarios, but only using metadata. This was
not useful in all cases, for example, for a control panel that
does not allow the user to enter any metadata.
Mimic 0.0 was...
Single Region
Lekha: Mimic also did not implement multiple regions. It used
global variables for storing all state, which meant that it was
hard to add additional endpoints with different state in the same
running mimic instance.
Beyond Auto Scale:
Refactoring Mimic
Glyph: Mimic had an ambitious vision: to be a one-stop mock for all
OpenStack and Rackspace services that needed fast integration
testing. However, its architecture at the time severely limited
the ability of other teams to use it or contribute to it. As Lekha
mentioned, it was specific not only to Rackspace but to Auto Scale.
YAGNI
Glyph: On balance, Mimic was also extremely simple. It followed
the You Aren't Gonna Need It principle of extreme programming very
well, and implemented just the bare minimum to satisfy its
requirements, so there wasn't a whole lot of terrible code to throw
out or much unnecessary complexity to eliminate.
E(ITO)YAGNI
Glyph: There is, however, a corollary to YAGNI, which is
E(ITO)YAGNI: Eventually, It Turns Out, You *Are* Going To Need It.
As Mimic grew, other services within Rackspace wanted to make use
of its functionality, and a couple of JSON response dictionaries in
global variables were not going to cut it any more.
Plugins!
Glyph: So we created a plugin architecture.
Identity
Is the Entry Point
(Not A Plugin)
Glyph: Mimic's Identity endpoint is the top-level entry point to
Mimic as a service. Every other URL to a mock is available from
within the service catalog. As we were designing the plugin API,
it was clear that this top-level Identity endpoint needed to be the
core part of Mimic, and plug-ins would each add an entry for
themselves to the service catalog.
http://localhost:8900/mimicking/NovaApi-78bc54/ORD/v2/tenant_id_f15c1028/servers
Glyph: URLs within Mimic's service catalog all look similar. In
order to prevent conflicts between plugins, Mimic's core
encodes the name of your plugin and the region name specified by
your plugin's endpoint. Here we can see what a URL for the Compute
mock looks like. (CLICK) This portion of the URL, which identifies
which mock is being referenced, is handled by Mimic itself, so that
it's always addressing the right plugin. (CLICK) Then there's the
part of the URL that your plugin itself handles, which identifies
the tenant and the endpoint within your API.
Plugin Interface:
“API Mock”
Glyph: Each plugin is an API mock, which has only two methods:
class YourAPIMock():
    def catalog_entries(...)
    def resource_for_region(...)

(that's it!)
Glyph: (click) catalog_entries,
(click) and resource_for_region.
(click) That's it!
def catalog_entries(self, tenant_id):
Glyph: catalog_entries
takes a tenant ID and returns the
entries in Mimic's service catalog for that particular API mock.
APIs have catalog entries for each API type, which in turn have
endpoints for each virtual region they represent.
return [
    Entry(
        tenant_id, "compute", "cloudServersOpenStack",
        [
            Endpoint(tenant_id, region="ORD",
                     endpoint_id=text_type(uuid4()),
                     prefix="v2"),
            Endpoint(tenant_id, region="DFW",
                     endpoint_id=text_type(uuid4()),
                     prefix="v2")
        ]
    )
]
Glyph: This takes the form of an iterable of a class called
(CLICK) Entry, each of which is (CLICK) a tenant ID,
(CLICK) a type, (CLICK) a name, (CLICK) and a collection of
(CLICK) Endpoint objects, each (CLICK) containing (CLICK)
the name of a pretend region (CLICK) and a URI version prefix that
should appear in the service catalog after the generated service
URL but before the tenant ID.
def resource_for_region(self, region, uri_prefix,
                        session_store):
    return (YourRegion(...)
            .app.resource())
Glyph: resource_for_region takes (CLICK) the name of a
region, (CLICK) a URI prefix - produced by Mimic core to make the
URI for each service unique, so you can generate URLs to your
services in any responses which need them - (CLICK) and a session
store where the API mock may look up the state of the resources it
pretended to provision for the respective tenants.
(CLICK) resource_for_region returns an HTTP resource
associated with the top level of the given region. This resource
then routes requests to any tenant-specific resources associated
with the full URL path.
class YourRegion():
    app = MimicApp()

    @app.route('/v2/<string:tenant_id>/servers',
               methods=['GET'])
    def list_servers(self, request, tenant_id):
        return json.dumps({"servers": []})
Glyph: Once you've created a resource for your region, it handles
routes for the part of the URI path that comes after the region
prefix. Here you can see what the nova "list servers" endpoint would
look like using Mimic's API; as you can see, it's not a lot of work
at all to return a canned response. It would be a little beyond
the scope of this brief talk to do a full tutorial of how resource
traversal works in the web framework that Mimic uses, but hopefully
this slide - which is a fully working response - shows that it
is pretty easy to get started.
Tell Mimic
To Load It
Glyph: Now that we have most of a plugin written, let's get Mimic
to load it up.
# mimic/plugins/your_plugin.py
from your_api import YourAPIMock
the_mock_plugin = YourAPIMock()
Glyph: To register your plugin with Mimic, you just need to drop an
instance of it into any module of the mimic.plugins
package.
Mimic Remembers
(until you restart it)
Glyph: This, of course, just shows you how to create ephemeral,
static responses - but as Lekha said previously, Mimic doesn't just
create fake responses; it remembers - (CLICK) in memory - what
you've asked it to do.
session = session_store.session_for_tenant_id(tenant_id)

class YourMockData():
    "..."

your_data = session.data_for_api(your_api_mock,
                                 YourMockData)
Glyph: That "session_store" object passed to resource_for_region is
the place you can keep any relevant state. It gives you a
per-tenant session object, and then you can ask that session for
any mock-specific data you want to store for that tenant. All
session data is created on demand, so you pass in a callable which
will create your data if no data exists for that tenant/API pair.
session = session_store.session_for_tenant_id(tenant_id)

from mimic.plugins.other_mock import (other_api_mock,
                                      OtherMockData)

other_data = session.data_for_api(other_api_mock,
                                  OtherMockData)
Glyph: Note that you can pass other API mocks as well, so if you
want to inspect a tenant's session state for other services and
factor that into your responses, it's easy to do so. This pattern
of inspecting and manipulating a different mock's data can also be
used to create control planes for your plugins, so that one plugin
can tell another plugin how and when to fail by storing information
about the future expected failure on its session.
Errors As A Service
Glyph: We are still working on the first error-injection endpoint
that works this way, by having a second plugin tell the first what
its failures are, but this is an aspect of Mimic's development we
are really excited about, because that control plane API also
doubles as a memory of the unexpected, and even potentially
undocumented, ways in which the mocked service can fail.
Error Conditions Repository
Lekha: Anyone testing a product will run into unexpected
errors. That's why we test! But we don't know what we don't know,
and can't be prepared for it ahead of time, right?
Discovering Errors
Against Real Services
Lekha: When we were running the Auto Scale tests against Compute,
we began to see some one-off errors. For example, when provisioning
a server, the test expected the server to stay in a building state
for some time before becoming active, but it would remain in the
building state for over an hour, or sometimes even go into an error
state afterward.
Record Those Errors
Within Mimic
Lekha: Auto Scale had to handle such scenarios gracefully, and the
code was changed to do so. And Mimic provided a way to test this
consistently.
Discover More Errors
Against Real Services
Lekha: However, like I said, we don't know what we don't know. We
were not anticipating finding any other such errors, but there were
more! And this was a slow process for us, uncovering such errors as
we tested against the real services.
Record Those Errors
For The Next Project
Lekha: And we continued to add such errors to Mimic.
Now, wouldn't it be great if not every client that depended on
a service had to go through this same cycle? If not everyone
had to find all the possible error conditions in the service by
experience, and deal with them at the pace at which they occur?
Share A Repository
For Future Projects
Lekha: What if we had a repository of all such known errors that
everyone contributes to? Then the next person using the plugin
could use the existing ones, ensure their application behaves
consistently irrespective of the errors, and add any new ones they
find.
Mimic Is A Repository
Lekha: Mimic is just that: a repository of all known responses,
including the error responses.
Mimic Endpoint
/mimic/v1.0/presets
Lekha: Mimic has a `presets` endpoint that today lists all the
metadata-related error conditions that can be simulated using Mimic.
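For example, fetching that list is a single GET against the path on the slide above (a sketch, assuming a local Mimic on port 8900 and a JSON response body):

# A sketch: list the metadata-based error conditions Mimic can simulate,
# assuming a local instance on port 8900 and a JSON response body.
import requests

presets = requests.get("http://localhost:8900/mimic/v1.0/presets").json()
print(presets)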
Control
Glyph: In addition to storing a repository of errors, Mimic allows
for finer control of behavior beyond simple success and error. You
can determine the behavior of a mimicked service in some detail.
Now & Later
Glyph: We're not just here today to talk about exactly what Mimic
offers right now, but where we'd like it to go. And in that spirit
I will discuss one feature that Mimic has for controlling behavior
today, and one which we would like to have in the future.
Now
Glyph: Appropriately enough, since I'm talking about things now and
things in the future, the behavior-control feature that Mimic has
right now is the ability to control time.
now()
Glyph: That is to say: when you do something against Mimic that
will take some time, such as building a server, time does not
actually pass ... for the purposes of that operation.
/mimic/v1.1/tick
Glyph: Instead of simply waiting 10 seconds, you can hit this
second out-of-band endpoint, the "tick" endpoint ...
{
    "amount": 1.0
}
Glyph: with a payload like this. It will tell you that time has
passed, like so:
{
    "advanced": 1.0,
    "now": "1970-01-01T00:00:01.000000Z"
}
Glyph: Now, you may notice there's something a little funny about
that timestamp - it's suspiciously close to midnight, January
first, 1970. Mimic begins each run thinking it's 1970, at the Unix
epoch; if you want to advance the clock, just plug in the number of
seconds since the epoch as the "amount" and your Mimic will appear
to catch up to real time.
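Advancing the clock is just a POST to that endpoint; a sketch, assuming a local Mimic on port 8900:

# A sketch of advancing Mimic's clock by one second via the out-of-band
# "tick" endpoint, assuming a local instance on port 8900.
import requests

result = requests.post("http://localhost:8900/mimic/v1.1/tick",
                       json={"amount": 1.0}).json()
# result resembles:
# {"advanced": 1.0, "now": "1970-01-01T00:00:01.000000Z"}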
{
    "server": {
        "status": "BUILD",
        "updated": "1970-01-01T00:00:00.000000Z",
        "OS-EXT-STS:task_state": null,
        "user_id": "170454",
        "addresses": {},
        "...": "..."
    }
}
Glyph: If you've previously created a server with "server_building"
metadata that tells it to build for some number of seconds, and you
hit the 'tick' endpoint telling it to advance time the
server_building number of seconds...
{
    "server": {
        "status": "ACTIVE",
        "updated": "1970-01-01T00:00:01.000000Z",
        "OS-EXT-STS:task_state": null,
        "user_id": "170454",
        "addresses": {},
        "...": "..."
    }
}
Glyph: that server (and any others) will now show up as "active",
as it should. This means you can set up very long timeouts and
have servers behave "realistically", but in a way where you can
exercise several hours of timeouts at a time.
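Putting the pieces together, here is a sketch (reusing the compute_url and token from the earlier authentication example; image and flavor values are placeholders) of building a slow server and then fast-forwarding it to ACTIVE:

# A sketch combining "server_building" metadata with the tick endpoint:
# the server stays in BUILD until Mimic's clock is advanced far enough.
# compute_url and token are assumed to come from the earlier auth sketch.
import requests

def build_then_fast_forward(compute_url, token, seconds=30):
    headers = {"X-Auth-Token": token}
    requests.post(compute_url + "/servers", headers=headers, json={
        "server": {"name": "slow-build",
                   "imageRef": "placeholder-image",
                   "flavorRef": "placeholder-flavor",
                   "metadata": {"server_building": seconds}},
    })
    # The server lists as BUILD until enough pretend time has passed...
    requests.post("http://localhost:8900/mimic/v1.1/tick",
                  json={"amount": float(seconds)})
    # ...after which the same listing shows it as ACTIVE.
    return requests.get(compute_url + "/servers", headers=headers).json()

# e.g. print(build_then_fast_forward(compute_url, token))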
--realtime
Glyph: You can ask Mimic to actually pay attention to the real
clock with the --realtime
command-line option; that
disables this time-advancing endpoint, but it will allow any test
suites that rely on real time passing to keep running.
Later
Glyph: Another feature that isn't implemented yet, that we hope to
design later, is the ability to inject errors ahead of time, using
a separate control-plane interface which is not part of a mock's
endpoint.
Error Injection
Glyph: We've begun work on a branch doing this for Compute, but we
feel that every service should have the ability to inject arbitrary
errors.
Error Injection
Currently: Metadata-Based
Glyph: As Lekha explained, Mimic can already inject some errors by
supplying metadata within a request itself.
Error Injection
Currently: In-Band
Glyph: However, this means that in order to cause an error to
happen, you need to modify the request that you're making to mimic,
which means your application isn't entirely unmodified.
Error Injection
Future: Separate Catalog Entry
Glyph: What we'd like to do in the future is to put the
error-injection control plane into the service catalog, with a
special entry type so that your testing infrastructure can talk to
it.
Error Injection
Future: Out-Of-Band
Glyph: This way, your testing tool would authenticate to mimic, and
tell Mimic to cause certain upcoming requests to succeed or fail
before the system that you're testing even communicates with
it. Your system would not need to relay any expected-failure data
itself, and so no metadata would need to be passed through.
Error Injection
Future: With Your Help
Glyph: What we'd really like to build with these out-of-band
failures, though, is not just a single feature, but an API that
allows people developing applications against openstack to make
those applications as robust as possible by easily determining how
they will react at scale, under load, and under stress, even if
they've never experienced those conditions. So we need you to
contribute the errors and behaviors that you have
experienced.
Even Later...
Glyph: Mimic is based on a networking framework ...
Glyph: ... some of you know which one I'm talking about ...
Even Later...
Future Possibilities,
Crazy Features!
Glyph: ... which has such features as built-in DNS and SSH servers.
Even Later...
Real SSH Server
For Fake Servers
Glyph: It would be really cool if when a virtual server was booted,
the advertised SSH port really did give you access to an SSH
server, albeit one that can be cheaply created from a local shell
as a restricted user or a container deployment, not a real virtual
machine.
Even Later...
Real DNS Server
For Fake Zones
Glyph: Similarly, if we were to have a Designate mock, it would be
really cool to have real DNS entries.
Mimic for OpenStack
Lekha: Mimic can be the tool where you do not have to stand up
an entire DevStack to understand how an OpenStack API behaves.
Mimic can be the tool which enables OpenStack developers to get
quick feedback on the code they are writing, and not have to go
through the gate multiple times to learn that "maybe I should have
handled that one error that the upstream system decides to throw my
way every now and then."
It's Easy!
Glyph: One of the things that I like to point out is that Mimic is
not real software. It's tiny, self-contained, doesn't need to
interact with a database, or any external services. Since it
mimics exclusively existing APIs, there are very few design
decisions. As a result, contributing to Mimic is a lot easier than
contributing to OpenStack proper.
We need your help!
Lekha: So, please come join us in building Mimic. Together we can make
this a repository for all known responses (including errors!) for the
OpenStack APIs.
As we mentioned earlier, Mimic is open source, and here is the
GitHub link to the repository.
All the features and issues we are working on, or planning to work
on in the near future, are under the issues tab on GitHub.
You can start by using Mimic and giving us your feedback. Or better
yet, fork it and contribute to it by adding plugins for services
that do not exist today!
Thank you!