Discussion: Caching in Fortress Core
Shawn McKinney
2018-11-16 12:18:19 UTC
Many years ago I made a very bad decision that has haunted me ever since — caching.

That’s right, short-term gratification (performance) at the expense of long-term satisfaction.

The biggest drawback is that things like Fortress REST aren’t running entirely stateless, which prevents running in a load-balanced topology.

So now the thought is: how do we drop the cache, or at least find a way to make it responsive to changes that occur on other nodes?

Before we make the decision to change, it’s probably best to identify all of the entities/relationships that will be affected.

1. OrgUnit entity validation
2. PwPolicy entity validation
3. AdminRole Hierarchies
4. Perm OU Hierarchies
5. Role Hierarchies
6. User OU Hierarchies
7. Dynamic Separation of Duty Constraints
8. Static Separation of Duty Constraints

1 & 2 are trivial; they simply contain lists of entity names that are validated when added as relationships to other entities (users mostly, but also Perms).
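To make that concrete, here is a minimal sketch of what such a validation amounts to; the class and method names are hypothetical, not Fortress’s actual API:

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a cached set of OrgUnit names used to validate the
// ou attribute of a User or Perm before the entry is written to the directory.
public class OrgUnitNameCache {

    // Names loaded from the directory and kept in memory.
    private final Set<String> ouNames = ConcurrentHashMap.newKeySet();

    // Sketch only: a real refresh would need to be atomic.
    public void load(Set<String> namesFromLdap) {
        ouNames.clear();
        ouNames.addAll(namesFromLdap);
    }

    // Called when a User (or Perm) references an OU; fails fast if unknown.
    public void validate(String ouName) {
        if (!ouNames.contains(ouName)) {
            throw new IllegalArgumentException("Unknown OrgUnit: " + ouName);
        }
    }
}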

The hierarchies are more about functionality. That is, the entities that comprise a particular hierarchy, e.g. Roles, are stored flat under a single ou, but their structure is contained within a graph data structure that is stored in cache and interpreted at runtime, i.e. is this role a parent of another role?

The same is true for the OU hierarchies.
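Roughly, the cached structure behaves like the sketch below: a plain adjacency map with hypothetical names. Fortress’s real implementation uses a graph library and richer relationship objects; the point is only that the hierarchy question is answered from memory, not from the directory, which is why staleness across nodes matters.

import java.util.*;

// Hypothetical sketch: entries are stored flat in LDAP, while the
// parent/child structure lives in an in-memory graph.
public class RoleGraph {

    // child role -> set of direct parent roles
    private final Map<String, Set<String>> parents = new HashMap<>();

    public void addRelationship(String child, String parent) {
        parents.computeIfAbsent(child, k -> new HashSet<>()).add(parent);
    }

    // "Is this role a parent (ancestor) of another role?" answered by
    // walking the cached graph rather than querying the directory.
    public boolean isAscendant(String candidateParent, String role) {
        Deque<String> stack = new ArrayDeque<>(parents.getOrDefault(role, Set.of()));
        Set<String> seen = new HashSet<>();
        while (!stack.isEmpty()) {
            String p = stack.pop();
            if (!seen.add(p)) {
                continue;
            }
            if (p.equals(candidateParent)) {
                return true;
            }
            stack.addAll(parents.getOrDefault(p, Set.of()));
        }
        return false;
    }
}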

The SoD caches hold the entire list of elements that comprise the dynamic and static constraints, used during validation, e.g. can this role be assigned or activated?
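As an illustration of the kind of check those caches serve, here is a hypothetical DSD validation sketch; the names and the cardinality convention are assumptions for the example, not Fortress’s actual code:

import java.util.*;

// Hypothetical sketch of a DSD check: the constraint sets live in cache,
// and activation is validated against the roles already active in a session.
public class DsdValidator {

    // One DSD constraint: a named role set plus the maximum number of its
    // members that may be active in a single session at the same time.
    public record DsdConstraint(String name, Set<String> roles, int cardinality) { }

    private final List<DsdConstraint> cachedConstraints = new ArrayList<>();

    public void load(Collection<DsdConstraint> fromLdap) {
        cachedConstraints.clear();
        cachedConstraints.addAll(fromLdap);
    }

    // "Can this role be activated?" Reject if doing so would exceed the
    // cardinality of any cached DSD constraint.
    public boolean canActivate(Set<String> activeRoles, String candidate) {
        for (DsdConstraint dsd : cachedConstraints) {
            if (!dsd.roles().contains(candidate)) {
                continue;
            }
            long activeMembers = activeRoles.stream().filter(dsd.roles()::contains).count();
            if (activeMembers + 1 > dsd.cardinality()) {
                return false;
            }
        }
        return true;
    }
}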

I believe there are a couple of directions we can take.

a. Continue to use the cache, but now have a listener attached to the data persistence that notifies of change, and triggers a refresh.

b. Remove the cache entirely and just pay for the cost.

I’d say ‘a’ if the complexity and overhead aren’t too great, otherwise ‘b’.
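Option ‘a’ amounts to something like the following hypothetical listener contract, regardless of what ends up feeding it (pub/sub, persistent search, and so on); the names are made up for illustration:

// Hypothetical sketch of option 'a': the persistence layer notifies a
// listener when a cached entry type changes, and the listener refreshes
// (or simply invalidates) the affected region of the cache.
public interface CacheChangeListener {

    enum EntityType { ORG_UNIT, PW_POLICY, ROLE, ADMIN_ROLE, PERM_OU, USER_OU, DSD, SSD }

    // Fired by whatever watches the directory for changes.
    void entityChanged(EntityType type, String entryDn);
}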

— Shawn
Stefan Seelmann
2018-11-16 19:56:06 UTC
Post by Shawn McKinney
a. Continue to use the cache, but now have a listener attached to the data persistence that notifies of change, and triggers a refresh.
b. Remove the cache entirely and just pay for the cost.
I’d say ‘a’ if the complexity and overhead aren’t too great, otherwise ‘b’.
a) means you have to notify the other Fortress servers about the change,
right? This can be done using some pubsub mechanism, but that requires
another component. Or do you want to use LDAP persistent search? I think
neither is strongly consistent, if that is important.

Maybe another option c) is to use a distributed cache. If I see correctly,
Fortress already uses Ehcache, which also provides a distributed cache,
Terracotta Server, but I'm not sure about the license. There are also
projects at the ASF like Ignite and Commons JCS. Otherwise there are
Memcached or Redis.
Emmanuel Lecharny
2018-11-16 22:20:32 UTC
Post by Stefan Seelmann
Post by Shawn McKinney
a. Continue to use the cache, but now have a listener attached to the
data persistence that notifies of change, and triggers a refresh.
Post by Shawn McKinney
b. Remove the cache entirely and just pay for the cost.
I’d say ‘a’ if the complexity and overhead aren’t too great, otherwise ‘b’.
a) means you have to notify the other Fortress servers about the change,
right? This can be done using some pubsub mechanism, but that requires
another component. Or do you want to use LDAP persistent search? I
think neither is strongly consistent, if that is important.
Persistent search is definitely an option, I think. Fortress will receive
all the updates coming from the LDAP server, and update the cache
accordingly. Now, that may not be that simple:
* consistency: not such a big deal, as long as the cache is blocked while
it's updated. Also remember that there is no strong guarantee that what you
get from an LDAP server is up to date, so...
* transactions: if some update impacts the cache while its data is being
used by a Fortress client, you may be in trouble (i.e., the client may be in
trouble). That is something requiring some deep thought.
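For concreteness, here is a sketch of wiring a persistent search to cache invalidation. It uses the UnboundID LDAP SDK only to keep the example short (Fortress itself talks to the directory through the Apache LDAP API), and the base DN, filter, and class names are assumptions:

import com.unboundid.ldap.sdk.*;
import com.unboundid.ldap.sdk.controls.PersistentSearchChangeType;
import com.unboundid.ldap.sdk.controls.PersistentSearchRequestControl;

// Sketch: keep one connection open per node, let the server push role
// changes, and invalidate the local cache when an entry is touched.
public class RoleChangeWatcher implements AsyncSearchResultListener {

    @Override
    public void searchEntryReturned(SearchResultEntry entry) {
        // A role entry was added/modified/deleted somewhere in the topology:
        // drop (or rebuild) the cached structure for it.
        System.out.println("invalidate cache for: " + entry.getDN());
    }

    @Override
    public void searchReferenceReturned(SearchResultReference reference) {
        // ignore referrals in this sketch
    }

    @Override
    public void searchResultReceived(AsyncRequestID requestID, SearchResult result) {
        // The persistent search ended (connection closed, server shut down...):
        // a real implementation would reconnect and do a full refresh here.
    }

    public static AsyncRequestID watch(LDAPConnection conn) throws LDAPException {
        // Base DN and filter are placeholders for wherever the roles live.
        SearchRequest request = new SearchRequest(new RoleChangeWatcher(),
                "ou=Roles,dc=example,dc=com", SearchScope.SUB,
                "(objectClass=*)", "cn");
        request.addControl(new PersistentSearchRequestControl(
                PersistentSearchChangeType.allChangeTypes(),
                true,    // changesOnly: skip the initial result set
                true));  // returnECs: ask for entry change notifications
        return conn.asyncSearch(request);
    }
}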

Ideally speaking, the Fortress cache could be based on MVCC, so that a Fortress
client will *always* fetch a consistent version of the data, and that would
guarantee that an incoming update will not impact the clients. Moreover, it
would allow clients to access the cache without any locking issue.
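The MVCC idea can be approximated cheaply on the Java side: readers always dereference an immutable snapshot, and a refresh builds a new snapshot and swaps a single reference, so readers never block and never see a half-applied update. A hypothetical sketch:

import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the "always read a consistent version" idea:
// the whole cached structure is an immutable snapshot behind one reference.
public class SnapshotCache<K, V> {

    private final AtomicReference<Map<K, V>> current =
            new AtomicReference<>(Map.of());

    // Readers never lock; they just dereference whatever snapshot is current.
    public V get(K key) {
        return current.get().get(key);
    }

    // An update (e.g. triggered by a persistent-search notification) builds a
    // complete new snapshot off to the side and publishes it atomically.
    public void refresh(Map<K, V> rebuilt) {
        current.set(Map.copyOf(rebuilt));
    }
}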
Post by Stefan Seelmann
Maybe another option c) is to use a distributed cache. If I see correctly,
Fortress already uses Ehcache, which also provides a distributed cache,
Terracotta Server, but I'm not sure about the license. There are also
projects at the ASF like Ignite and Commons JCS. Otherwise there are
Memcached or Redis.
Probably overkill :/

One other possibility: defer complex operations to the LDAP server, instead
of processing them in Fortress. You will be closer to where the data are
stored, this will be consistent, and you won't need to manage a cache in
Fortress. The drawback is that you will be limited in the servers you can
use: only those implementing the operations you would defer. A good way to
implement that is through extended operations, or in ApacheDS, using
storedProcedures (but considering that the SP interceptor may need to be
fixed, and also extended to support extended operations, this is a moot
possibility, IMHO).
--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
Shawn McKinney
2018-11-19 12:37:54 UTC
Post by Emmanuel Lecharny
* consistency: not such a big deal, as long as the cache is blocked while it's updated. Also remember that there is no strong guarantee that what you get from an LDAP server is up to date, so...
* transactions: if some update impacts the cache while its data is being used by a Fortress client, you may be in trouble (i.e., the client may be in trouble). That is something requiring some deep thought.
Ideally speaking, the Fortress cache could be based on MVCC, so that a Fortress client will *always* fetch a consistent version of the data, and that would guarantee that an incoming update will not impact the clients. Moreover, it would allow clients to access the cache without any locking issue.
I see the cache and transactions as two separate issues. Both have implications that extend beyond the other. For example, transactions have import beyond caching, and vice versa.

For the record, I can see transactions in fortress’ future.

Also agree with the idea of using a persistent search to notify of changes. Yes, there may still be rare cases where updates occurring in the client might not coincide with the latest data in the cache, but these are entities that aren’t updated frequently, so the risk is low. What can go wrong? An update will fail that otherwise should have succeeded.

My biggest concern is overhead. If every Fortress client has a set of persistent searches (there’d be eight, right?), what does this do to performance on either tier?
Post by Stefan Seelmann
Maybe another option c) is to use a distributed cache. If I see correctly,
Fortress already uses Ehcache, which also provides a distributed cache,
Terracotta Server, but I'm not sure about the license. There are also
projects at the ASF like Ignite and Commons JCS. Otherwise there are
Memcached or Redis.
The distributed cache was what I had in mind from the beginning. However, the idea of having another server in place just to manage these fairly trivial datasets doesn’t feel like an appropriate use today. One, it’s doubtful that this solution would be available to the community. Two, its worth outweighs the value.

—Shawn
Shawn McKinney
2018-11-19 12:40:48 UTC
Post by Shawn McKinney
Two, its worth outweighs the value.
er, costs more than its worth.
Emmanuel Lecharny
2018-11-19 14:10:41 UTC
Post by Emmanuel Lecharny
Persistent search is definitely an option, I think. Fortress will
receive all the updates coming from the LDAP server, and update the cache
Post by Emmanuel Lecharny
* consistency: not such a big deal, as long as the cache is blocked while
it's updated. Also remember that there is no strong guarantee that what you
get from an LDAP server is up to date, so...
Post by Emmanuel Lecharny
* transactions: if some update impacts the cache while its data is being
used by a Fortress client, you may be in trouble (i.e., the client may be in
trouble). That is something requiring some deep thought.
Post by Emmanuel Lecharny
Ideally speaking, the Fortress cache could be based on MVCC, so that a
Fortress client will *always* fetch a consistent version of the data, and
that would guarantee that an incoming update will not impact the clients.
Moreover, it would allow clients to access the cache without any locking
issue.
Post by Shawn McKinney
I see the cache and transactions as two separate issues. Both have
implications that extend beyond the other. For example, transactions have
import beyond caching, and vice versa.
For the record, I can see transactions in fortress’ future.
Also agree with the idea of using a persistent search to notify of changes.
Yes, there may still be rare cases where updates occurring in the client
might not coincide with the latest data in the cache, but these are
entities that aren’t updated frequently, so the risk is low. What can go
wrong? An update will fail that otherwise should have succeeded.
My biggest concern is overhead. If every Fortress client has a set of
persistent searches (there’d be eight, right?), what does this do to
performance on either tier?
No overhead. It's free. The client just gets notified when some update
occurs. The API is asynchronous anyway; worst case, you eat a connection
(through a thread) waiting for incoming updates.
Post by Stefan Seelmann
Maybe another option c) is to use a distributed cache. If I see correctly,
Fortress already uses Ehcache, which also provides a distributed cache,
Terracotta Server, but I'm not sure about the license. There are also
projects at the ASF like Ignite and Commons JCS. Otherwise there are
Memcached or Redis.
Post by Shawn McKinney
The distributed cache was what I had in mind from the beginning.
However, the idea of having another server in place just to manage these
fairly trivial datasets doesn’t feel like an appropriate use today. One,
it’s doubtful that this solution would be available to the community. Two,
its worth outweighs the value.
The LDAP server *is* your distributed cache, somehow :-)

Cordialement,
Emmanuel Lécharny
www.iktek.com
