Posted over 12 years ago by Rinat Abdullin
The past few weeks have been extremely busy for Lokad in many areas. I guess that's how it feels when a start-up starts taking off.
One of my personal priorities has been a new Lokad project called Data Platform. If you are interested in the business details, there is a blog post about it with a nice infographic.
Data Platform is essentially going to be:
a methodology and guidance on bringing together "big" data in an organization and making it easily consumable across that same organization, all at a fraction of the cost and complexity usually associated with an outrageously expensive "enterprise" setup;
an open source reference project demonstrating how to aggregate and process relatively complex domain data, either in a local data center or in the Windows Azure cloud;
should you need it, consulting, teaching and support from Lokad and partners, covering both the technological setup and the details of dealing with business intelligence;
as you can guess, many of these details and concepts will be explained in greater depth in due time on this blog and the BeingTheWorst.com podcast.
From the technical perspective there is absolutely nothing new or innovative in Data Platform. A lot of high-tech companies have been doing "big data" and "cloud computing" for years, from high-frequency trading up to the Large Hadron Collider. If you have seen Greg's Event Store, Lokad.Cloud and Lokad.CQRS projects, you already know what to expect.
From the business perspective, the situation is totally different. All this publicly available knowledge is about as useful to the vast majority of our customers as if it had been developed and used on the dark side of the moon. It's too far away, and it's too hard to get people who can handle it.
As you can guess, IT enthusiasts would not normally feel excited about managing data coming from cash registers in a retail company. This slows down technological progress in such companies a lot and creates an opportunity for sales-oriented software and consulting companies. These companies jump right in, selling mediocre but expensive stuff without really solving the problems.
This situation is the reason why 20GB of sales history is actually considered big data in these companies.
That's what we are trying to break with Data Platform. It's too painful for us to help customers with business intelligence (our core competency) when we can't even get the data out of the "expensive stuff" - it either breaks, slows everything down or requires literally months of effort to extract the data.
With Data Platform we want to show how, in some specific cases, it is possible to replace the "expensive stuff" with something much simpler and cheaper, while getting even better performance and reliability. For instance, sometimes a 1,000,000 EUR cluster can be replaced with a few virtual machines on Windows Azure for reliable storage of terabytes of data, while retaining decent throughput and a dead-simple way to consume that data.
The idea behind Data Platform is to make it extremely simple and cheap to give it a try (at least as a way to finally stop discarding valuable business history data). Think of a dead-simple event store (even simpler than the one used in the latest Lokad.CQRS) with projections on top, designed to get the most out of Windows Azure and store gigabytes or terabytes of messages.
That's a small and obvious step for a lot of people reading my blog; however, for a lot of enterprise companies it is a huge leap. Once it is made, a huge number of opportunities for moving forward suddenly open up: enterprise-wide Domain-Driven Design, efficient development organization around CQRS models, continuous delivery of certain elements, inherent scalability for reports, capabilities for real-time and occasional connectivity, etc.
Plus Lokad can stop wasting time on data integration issues and actually focus on business analytics.
PS: If all works out, Data Platform will be pluggable into Lokad.CQRS as another event storage engine.
|
Posted over 12 years ago by Rinat Abdullin
First of all, I would like to apologize for sometimes being slow to respond to emails and other sorts of communication. The last few weeks have been throwing new challenges at us, and dealing with them takes a little bit of time. I'll catch up.
In case you are interested, here are a few random ideas that have proven helpful in handling such situations. These ideas have been reiterated so many times that it is no longer possible to point to their original authors.
Don't try to be perfect in your decisions. You are human, and small errors are inevitable. Besides, every single decision is worthless by itself (just like any "brilliant" idea). Only through the continuous and careful application of effort can something worthy be achieved.
A continuous chain of good decisions will beat a perfect plan any day (simply because you can adapt and keep going). Likewise, good execution can be more important than any great idea alone.
Delegate. No matter how smart and talented we consider ourselves to be, alone we cannot handle and achieve as much as a team. Hence it is our duty to ensure that incoming challenges are balanced across the entire team. Our purpose is not to keep everybody under 100% load (e.g. by assigning tasks to people who are less efficient at handling them), but to ensure maximum efficiency of the entire unit.
Focus. Keeping multiple projects in your head at once is likely to drive you insane, cause insomnia, cause burnout or worse. So try breaking the entire problem field of your division or company into separate contexts. They can be really diverse: from accounting and HR management up to long-term tech R&D. More often than not, you will find that tackling a specific problem involves just a single context. So you can keep only one context in your head most of the time, switching between them as your day moves forward.
Arrange all tasks within a context in a queue, putting the most painful ones up front. Don't be afraid to drop tasks or change their priorities if the environment changes.
If you are tired but don't have time to rest, switch contexts. If you are exhausted but don't have time to rest, stop complaining and find some time. A full personal burnout is far more expensive than a little bit of rest.
Adapt to changes on the battlefield. You would be surprised by the amount of resources wasted on projects even after they are doomed. Wars are not won by sheer force, unless you are the USA (and even the States ran into issues while bringing democracy to countries that lacked serious firepower but had populations willing to take advantage of every hill, forest and trick in the book).
By being willing to accept that our initial plan is imperfect, and to adapt, we can reduce risks, save resources and potentially leverage new opportunities. Think "Instagram" or "Apple".
People are the only reason things happen; don't forget about it. The universe was empty and without purpose until humanity showed up, bringing purpose along (with a noticeable degree of chaos).
It might be tempting to forget about people and focus on a single technology, idea, concept or code. However, by doing that we risk missing the whole point and actually undermining our own efforts.
For example, writing complex code without unit tests or documentation is often perceived as the sign of an outstanding hacker or even a guru. Sometimes it is even worth it. However, more often than not, other people will have to maintain such code for years. If that is the case, then it is egoistic not to think about them. Code can have a stronger positive impact if we put additional effort into making it helpful and friendly. That is much harder than delivering egoistic "easy to write and hard to read" code, but the result will have a greater long-term impact.
The same principle applies to all the other things we do in our everyday lives. This is not even altruism, but merely common sense and adequate long-term thinking.
And the last one: don't stop, and keep pushing.
PS: I don't claim to follow these ideas every single day. However, continuously trying to do so helps.
|
Posted over 12 years ago
It seems to me that as we grow, our pace of innovation continues to accelerate. We are currently just short of a frenzy. More clients means much more exposure to high-priority problems in eCommerce and retail, which is our food for innovation.
The latest addition to our portfolio of Big Data Commerce solutions is a cloud-based BIG DATA PLATFORM. It is a truly exciting vision that has been cast into a concept and product: make the capturing, storing and exploiting of all of your company's transactional data simple, efficient and low-cost, in a fast, reliable and agile data platform. Combine this with smart applications that exploit this data to make smarter, faster operative decisions that address specific problems in the company.
Couponing, inventory optimization, pricing, store assortment optimization and personalization of online and offline customer communication are all examples of what can be accomplished with such a system in an efficient and low-cost manner. Customer satisfaction, rapid ROI and extreme profitability are at the core of what makes us so excited. Enough said; we chose to use this announcement to try our luck on our very first.... INFOGRAPHIC.
Do you share the excitement of this vision? Like or hate our infographic? Please get in touch or post in the comments.
|
Posted over 12 years ago by Rinat Abdullin
How many times have you wanted to start a new project and implement some really exciting idea? I've been there multiple times myself. Most of my attempts failed at the very beginning because I was trying to think and plan too much in advance:
How do I plan for future extensibility and adding new features?
What if I need to switch between databases - how do we abstract this away?
How do we scale out to 1,000,000 users?
I need some formal process around features and releases. If the project becomes successful and grows, new team members should have no difficulty joining in.
These are the most sane of the imaginary requirements that used to come to my mind (more exotic ones included terms like "neural networks", "Linux kernel", "ARM processor support" and "should make good espresso").
All this felt like a good thing, as if I were planning for every feature and problem in advance.
In practice, however, this somehow turned simple and exciting projects into challenging sets of problems that had to be solved all at once. Most of the time, these sets were so complicated that I could only stare at them in awe, without the slightest idea of where to start or what to do next. This state is often called analysis paralysis (or the worst way of dreaming). As you might guess, half of these projects were dropped on the spot, while the other half failed later during careful planning and execution.
We may wish to be prepared for a lot of problems and features in advance. But do we really expect them all to happen at once? Preparing for that is really hard to achieve.
Life is simple. You can't walk 1000 miles at once. There has to be a first step, and then the one that comes after it.
There is an approach that helps to move development forward in such situations (I first heard it from Gregory Young). It can be really hard for developers, since we are all inherently perfectionists. Instead of trying to plan the entire project in advance, we just take the smallest bite possible. You can call it a "prototype", a "minimum viable product", "let's give it a try" or the "dirtiest and hackiest code I've ever written". This attempt will be fast and deal with the core idea. If it fails, it will fail fast; if it makes at least some sense, it will only get better from this point. We can focus on the most painful problem that makes this idea shine (it will be easy to prioritise) and solve it. Then the next one, and the next.
The idea is just to start walking towards the goal, instead of burning yourself out attempting a 1000-mile jump (only to discover that you jumped in the wrong direction).
The approach becomes even more valuable when there are multiple stakeholders involved in the project. It is much easier to arrive at collaborative analysis paralysis when everybody keeps throwing their dreams in: "we want this", "it should do this", "what if this happens?".
The simplest solution to the core problem provides the team with a starting point for discussion and planning. It makes discussions more real than juggling wishes and fears in the abstract problem space. This approach also helps to prioritise further progress: you focus on the most painful thing first.
Life and projects can overwhelm you with problems. Keep it simple and focus on making the next most important step. Step by step, you can walk around the world.
|
Posted over 12 years ago by Rinat Abdullin
I have had a lot of failures in my past development experience. Most of them were caused by being completely obsessed with some cool technology or trick. These things were so appealing that the desire to use them became the central idea of an application.
Among those failures, these were caused by a design obsessed with some technology:
Design driven by principles of UI composition and flexibility, where you build an ultimately flexible CRM system with any number of fields, queries and forms.
Inversion-of-control-container-driven design, where you design a system by dropping a large pile of services, controllers and managers into the IoC container, and then letting it resolve a complex graph.
ORM-driven design, where you design your "business objects" and the rest of the system is wired almost automatically.
CQRS-driven design, where you take this principle as an architectural guideline and end up with a complete mess of messages and views.
Lesson learned: if the central idea of your design is about a technology, then the system will become a slave of that technology. All the advantages and limitations of that technology will eventually come to the fore and strike you really hard.
If you start your system design with the assumption of using a certain framework, database or tool, you are already paying tribute to this obsession. It is unavoidable to some extent, since we are limited by the knowledge and capabilities of our development teams.
However, we can reduce the bad side effects by trying hard to focus on an idea that is worth becoming the center of your application. As you have probably guessed, this idea is about solving the real-world business problem you have at hand (granted that this problem is worth solving). Examples of such problems are:
helping a business to optimize its pricing strategies across hundreds of thousands of products, to increase turnover and reduce the amount of inventory that is thrown away;
enabling a company to serve millions of its customers better by allowing behavioral analysis of each individual and suggesting healthier and cheaper products;
helping a hospital to serve its patients better by providing more efficient ways to diagnose patients, schedule available resources or collaborate on information about treatments and medications.
Technologies, stacks and approaches are merely replaceable tools that help to support such a solution (even if the tech is as cool as cloud computing, event sourcing or $YourCurrentlyFavoriteTechnologyHere$). Pick them consciously and don't let them become the core idea behind the design of your solution. Such obsessions are among the most expensive ones.
While designing systems, we tend to use all the cool tech we love. An obsession with solving business problems makes for a better design.
|
Posted over 12 years ago by Rinat Abdullin
When you no longer need to worry about the persistence of A+ES entities, their captured behaviours tend to get more complex and intricate. In order to deliver software reliably in such conditions, we need a non-fragile and expressive way to capture and verify these behaviours with tests, while avoiding any regressions.
A+ES stands for Aggregates with Event Sourcing. This topic is covered in great detail in episodes of the BeingTheWorst podcast.
In other words, we need to ensure that:
tests will not break as we change the internal structure of aggregates;
tests are expressive enough to easily capture any complex behaviour;
they match the mental model of aggregate design and are understandable even by junior developers.
One solution is to focus on specific use cases using "specifications" or "given-when-then" tests. Within such tests we establish that:
given certain events;
when a command is executed (in our case);
then we expect some specific events to happen.
The primary difference between a specification and a normal unit test is that the former explicitly defines and describes a use case in a structured manner, while the latter just executes code. Each A+ES specification can be executed as a unit test, while the reverse is not necessarily true.
Due to the strong synergy with DDD and the lack of coupling to the internal structural representation of an A+ES entity, these tests capture intent and are not affected by internal refactorings (a fragility common to CRUD-based Aggregate implementations).
In C# you can express such a test case as:
[Test]
public void with_multiple_entries_and_previous_balance()
{
    // Given: prior history, plus configuration of test domain services
    Given(
        Price.SetPrice("salescast", 50m.Eur()),
        Price.SetPrice("forecast", 2m.Eur()),
        new CustomerCreated(id, "Landor", CurrencyType.Eur, guid, Date(2001)),
        new CustomerPaymentAdded(id, 1, 30m.Eur(), 30m.Eur(), "Prepaid", "magic", Date(2001)),
        ClockWasSet(2011, 3, 2)
    );
    // When: the command under test is executed
    When(
        new AddCustomerBill(id, bill, Date(2011, 2), Date(2011, 3), new[]
        {
            new CustomerBillEntry("salescast", 1),
            new CustomerBillEntry("forecast", 2),
            new CustomerBillEntry("forecast", 8)
        })
    );
    // Then: these events are expected to be produced
    Expect(
        new CustomerBillChargeAdded(id, bill, Date(2011, 2), Date(2011, 3), new[]
        {
            new CustomerBillLine("salescast", "Test Product 'salescast'", 1, 50m.Eur()),
            new CustomerBillLine("forecast", "Test Product 'forecast'", 10, 20m.Eur())
        }, 2, 70m.Eur(), -40m.Eur(), Date(2011, 3, 2))
    );
}
The test above is based on Lokad's version of the A+ES testing syntax, which was pushed to the master branch of the Lokad.CQRS Sample Project. Look for the spec_syntax class there.
Please note that these specifications test A+ES entities at the level of application services (they accept command messages instead of method calls). This means that any Domain Services (helper classes passed by the application service down to the aggregate method call) are handled by the application service as well.
In this case we can use test implementations of domain services, configuring them via special events. Such events are generated by helper methods (e.g. Price.SetPrice("salescast", 50m.Eur()) or ClockWasSet(2011, 3, 2)). This allows us to reduce test fragility and also gain implicit documentation capabilities.
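For reference, the Given/When/Expect syntax itself does not require much machinery. Below is a minimal sketch of a specification base class; it is my own illustration, not the actual spec_syntax code, the IAggregate interface and its members are assumed abstractions, and the command is dispatched straight to the aggregate for brevity (the real syntax routes it through an application service):
using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

public interface IAggregate
{
    void ReplayEvents(IEnumerable<object> events);
    void Execute(object command);
    IEnumerable<object> GetUncommittedEvents();
}

public abstract class Specification<TAggregate>
    where TAggregate : IAggregate, new()
{
    readonly List<object> _given = new List<object>();
    List<object> _produced;

    protected void Given(params object[] events)
    {
        _given.AddRange(events);
    }

    protected void When(object command)
    {
        var aggregate = new TAggregate();
        // Rebuild aggregate state purely from the given history
        aggregate.ReplayEvents(_given);
        // Run the behaviour under test
        aggregate.Execute(command);
        _produced = aggregate.GetUncommittedEvents().ToList();
    }

    protected void Expect(params object[] expectedEvents)
    {
        // Comparing rendered representations keeps the assertion
        // decoupled from the internal structure of the aggregate
        CollectionAssert.AreEqual(
            expectedEvents.Select(e => e.ToString()).ToArray(),
            _produced.Select(e => e.ToString()).ToArray());
    }
}
A test fixture like the one above would then inherit from such a base class and get Given/When/Expect for free.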
Specifications as Living Documentation
There are a few more side benefits of using specifications for testing business behaviours. First of all, specifications can act as living documentation which is always up to date. For instance, the rendered documentation for the specification above would look like:
Test: add customer bill
Specification: with multiple entries and previous balance
GIVEN:
1. Set price of salescast to 50 EUR
2. Set price of forecast to 2 EUR
3. Created customer Customer-7 Eur 'Landor' with key 29c516fb-bdaf-48f5-a83d-d1dca263fdb6...
4. Tx 1: payment 30 EUR 'Prepaid' (magic)
5. Test clock set to 2011-03-02
WHEN:
Add bill 1 from 2011-02-01 to 2011-03-01
salescast : 1
forecast : 2
forecast : 8
THEN:
1. Tx 2: charge for bill 1 from 2011-02-01 to 2011-03-01
Test Product 'salescast' (1 salescast): 50 EUR
Test Product 'forecast' (10 forecast): 20 EUR
Results: [Passed]
This can be achieved by merely overriding the ToString() methods of the event and command contract classes. The open source SimpleTesting sample provides more details.
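For example, a hypothetical event contract (simplified from the ones used in the test above, which carry more fields) could render the "Tx 1: payment 30 EUR 'Prepaid' (magic)" line like this:
public sealed class CustomerPaymentAdded
{
    public readonly int Transaction;
    public readonly decimal Payment;
    public readonly string PaymentName;
    public readonly string Source;

    public CustomerPaymentAdded(int transaction, decimal payment,
        string paymentName, string source)
    {
        Transaction = transaction;
        Payment = payment;
        PaymentName = paymentName;
        Source = source;
    }

    // Human-readable rendering picked up by the living documentation
    public override string ToString()
    {
        return string.Format("Tx {0}: payment {1} EUR '{2}' ({3})",
            Transaction, Payment, PaymentName, Source);
    }
}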
Detailed documentation of A+ES behaviours, defined in the form of specifications, always stays up to date and in sync with code changes.
Specifications as Design Tool
If we push this concept of living documentation further down the road, specifications can be used to communicate with business experts about use cases, using the Ubiquitous Language and domain models. You can express use cases in text as "Given-When-Then", have a junior developer code them as unit tests, and then have developers implement the functionality against them.
Additional practical usage scenarios for specifications include:
You can print out all specifications as a really thorough list of use cases to be signed off by project stakeholders.
Specifications can easily be visualized as diagrams and graphs. These could help you better understand your domain, find untested or overly complicated spots, and drive development in general.
For instance, such a diagram could look like this:
Hope this helps. I plan to cover this topic in greater detail in upcoming episodes of the BeingTheWorst podcast.
|
Posted over 12 years ago by Rinat Abdullin
Over the last week I've been thinking about high-scale production setups for event-centric architectures: something that can handle retail networks in real time while providing a cost-effective way to deal with business amnesia. Obviously there is Greg's Event Store (to be released tomorrow), but having multiple deployment options is even better.
Here's a quick overview of implementing an event store with Redis. Redis is an open source key-value store written in C, with configurable reliability guarantees, master-slave replication and a diverse set of server-side storage primitives.
ServiceStack developers use Redis extensively for caching. They have even developed the ServiceStack.Redis client for C#.
With immediate persistence (fsync after each change) and eventual replication, you can easily get thousands of commits per second on a simple machine. This is far less than specialized event store implementations, but could be good enough for a low-cost production deployment. Besides, you can speed things up by fsyncing once per second instead. See the benchmarks, or check out the series of articles on event sourcing with Redis and Scala.
Event Storage Primitives
We can use the following Redis primitives for event storage persistence:
Hash - provides fast O(1) get/set operations for individual events
List - stores the associations of events to individual streams (fast to append)
Store individual events in the hash structure (allowing O(1) operations):
> HSET EventStore e1 Event1
Where:
EventStore - the name of the hash used for storing events (might as well be one store per Redis DB)
e1 - a sequentially incrementing commit id
Event1 - the event data
You can get the number of events in the store with:
> HLEN EventStore
(integer) 8
In order to enumerate all events in a store, you simply ask Redis to return all hash values given their IDs, for example:
> HMGET EventStore e1 e2 e3 e4
1) "Event1"
2) "Event2"
3) "Event3"
4) "Event4"
Individual event streams are just lists containing references to individual commit IDs. You can add events to a stream with RPUSH. For instance, here we add events e2, e4 and e7 to the list customer-42:
> RPUSH customer-42 e2 e4 e7
The version of an individual event stream is the length of the corresponding list:
> LLEN customer-42
(integer) 3
In order to get the list of commits associated with a given stream:
> LRANGE customer-42 0 3
1) "e2"
2) "e4"
3) "e7"
In order to achieve fast performance and transactional guarantees, we can run each commit operation as a server-side Lua script (see the sketch below), which will:
Provide concurrent conflict detection
Push the event data to the hash
Associate the event with a stream
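A minimal sketch of such a commit script follows; the key names and the EVAL calling convention are my illustration, not code from an actual implementation. It checks the expected stream version, stores the event in the EventStore hash under a fresh commit id, and appends that id to the stream list:
-- Commit script sketch. Invoked as, for example:
--   EVAL <script> 1 customer-42 3 "Event8"
-- KEYS[1] - stream name; ARGV[1] - expected stream version; ARGV[2] - event data
local stream = KEYS[1]
local expected = tonumber(ARGV[1])
local data = ARGV[2]
-- 1. Concurrent conflict detection: the stream version is its list length
if redis.call('LLEN', stream) ~= expected then
  return redis.error_reply('concurrency conflict on ' .. stream)
end
-- 2. Push event data to the hash under a new sequential commit id
--    (HLEN works as a counter only because commits are never deleted)
local id = 'e' .. (redis.call('HLEN', 'EventStore') + 1)
redis.call('HSET', 'EventStore', id, data)
-- 3. Associate the event with the stream
redis.call('RPUSH', stream, id)
return id
Since Redis executes a script atomically, the version check and both writes behave as a single transaction.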
Publishing and replays
Redis provides a basic primitive for PUB/SUB. This means that we can push an event notification to zero or more subscribers immediately (in the same transaction) or eventually:
> PUBLISH EventStore e1
This means that in order for a projection host (or any event listener) to catch up with the latest events, we follow the sequence shown below:
Get the current version of the event store: HLEN
Enumerate all events from 0 to that version with HMGET
Subscribe to new events if nothing new has arrived since we started replaying (otherwise, read the new batch first): SUBSCRIBE
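Continuing the hypothetical session from the examples above (four events in the store and nothing new arriving during the replay), the whole sequence would look like:
> HLEN EventStore
(integer) 4
> HMGET EventStore e1 e2 e3 e4
1) "Event1"
2) "Event2"
3) "Event3"
4) "Event4"
> SUBSCRIBE EventStore
Reading messages... (press Ctrl-C to quit)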
Additional side effects
First, since Redis is a key-value store, we can also persist within the same setup:
Aggregate snapshots
Projected views
Second, the message queue capability can be handy for load-balancing work commands between multiple servers.
Third, the server-side capability for associating events with event streams (an individual event stream is just a collection of pointers to event IDs) can be handy for event-sourced business processes.
|
Posted over 12 years ago by Rinat Abdullin
by
Rinat Abdullin
Starting from this week I'll be open again for consulting missions. I don't have years of enterprise experience behind me, but I have picked up a trick or two so far. If you are interested, there are two options.
Lokad consulting missions - remote or on-site (pretty much anywhere in the world). That's part of what Lokad does, and could also include some research and development by the company. Get in touch with me, or directly at [email protected], if you are interested.
There are some more details at lokad.com. They don't mention pure design and development topics like CQRS/ES/DDD, but those are assumed as well.
Some of my free hours now and then - remote only, limited in availability mostly to weekends and evenings. There is a lot I want to do in the upcoming months (including polishing Being the Worst), and some spare cash for additional resources would not hurt. Get in touch with me directly for that. I can't promise a lot of hours here, so if you need more, go for the Lokad option.
However, if you are a one-man shop or a small startup, it might be better to simply stay tuned to the BTW podcast and ask questions in the Lokad group (I answer them whenever there is a bit of free time). This might take some time, but it comes for free (which is really important for startups).
|
Posted over 12 years ago
The 2012 Data Days are taking shape, and we are looking forward to participating in the panel of speakers. With Big Data Intelligence experience in both eCommerce and physical retail, we have been invited to add a unique perspective, drawing on our knowledge of two very different worlds in terms of big data availability and exploitation, to an otherwise rather eCommerce-focused event.
Across four tracks, the topics of Data, Relevance, Innovation and Privacy will be discussed. On the second day, the Data Pioneers start-up competition will be looking for creative and innovative data business ideas. The program is currently being finalized.
Are you coming to the event? Please connect with us prior to the event, or simply grab us on the day. We are looking forward to seeing you in Berlin!