From vincent at nexedi.com  Thu Dec 23 12:20:10 2010
From: vincent at nexedi.com (Vincent Pelletier)
Date: Thu, 23 Dec 2010 11:20:10 +0000
Subject: [Neo-dev] Code status update & short roadmap
In-Reply-To: <201012151320.19156.vincent@nexedi.com>
References: <201012151320.19156.vincent@nexedi.com>
Message-ID: <201012231120.10984.vincent@nexedi.com>

On Wednesday 15 December 2010 13:20:19, Vincent Pelletier wrote:
> Client code will be fixed so this test passes.

I have working code on my machine for this. I need to split it into
smaller patches if possible (it's quite huge), and update NEO tests.

> An iterator test fails on current code, which is because current iterator
> implementation is too superficial. It is being completely reworked.

Done in r2550.

There is a notable recent change (r2564; some fixes are coming, as we found
bugs after committing), which reduced the storage tpc_finish lock lifespan
to increase storage operation parallelism (object-level read-lock taking &
answering master). The lock should now be held for the shortest possible
time, and this is expected to reduce the number of deadlocks encountered
after the work on locks described in the previous mail.

For the record, those deadlocks happen in either of these scenarios:
- competing transactions taking the write lock on the same object on
  differently-sorted lists of storage nodes (T1 locks on S1, T2 locks on
  S2, then T1 tries to lock on S2 while T2 tries to lock on S1). This can
  be reduced somewhat, but there is no perfect solution known yet.
- competing transactions taking write locks for a common subset of
  objects, but in a different order (T1 locks O1, T2 locks O2, then T1
  tries to lock O2 while T2 tries to lock O1).

Both cases cause the lock timeout to be reached (30s by default), and at
least one transaction gets rolled back (ConflictError is raised), freeing
locks for the remaining transaction. This degrades performance
significantly, and it's bad to roll back transactions.
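To make the second scenario concrete, here is a minimal, single-threaded
sketch of a toy lock table. It is not NEO code: LockTable and run_schedule
are hypothetical names invented for this illustration, and there is no
timeout, so a blocked transaction simply stays blocked. The last call shows,
for contrast only, that the cycle does not form in this toy model when both
transactions request locks in the same order; it makes no claim about what
NEO does or should do.

```python
# Illustrative toy model only; not NEO's storage code.
# LockTable and run_schedule are hypothetical names.

class LockTable:
    """Maps an object id to the transaction holding its write lock."""

    def __init__(self):
        self.owner = {}

    def try_lock(self, txn, oid):
        """Take the lock if it is free or already ours; True on success."""
        return self.owner.setdefault(oid, txn) == txn

    def release_all(self, txn):
        """Drop every lock held by txn (it finished)."""
        for oid in [o for o, t in self.owner.items() if t == txn]:
            del self.owner[oid]


def run_schedule(orders):
    """Interleave lock requests round-robin, one request per turn.

    orders maps a transaction name to the object ids it locks, in order.
    Returns the set of transactions still blocked when nobody can progress.
    """
    table = LockTable()
    position = {txn: 0 for txn in orders}
    progressed = True
    while progressed:
        progressed = False
        for txn, oids in orders.items():
            i = position[txn]
            if i == len(oids):
                continue  # this transaction already finished
            if table.try_lock(txn, oids[i]):
                position[txn] = i + 1
                progressed = True
                if position[txn] == len(oids):
                    table.release_all(txn)  # got everything: finish
    return {txn for txn, oids in orders.items() if position[txn] < len(oids)}


# Scenario from the mail: T1 locks O1, T2 locks O2, then each wants the other.
deadlocked = run_schedule({"T1": ["O1", "O2"], "T2": ["O2", "O1"]})

# Same objects, but both transactions request them in sorted order.
same_order = run_schedule({"T1": ["O1", "O2"], "T2": sorted(["O2", "O1"])})
```

In this model, deadlocked contains both T1 and T2, which is the situation
the 30s lock timeout resolves by rolling one of them back, while same_order
is empty because T2 simply waits on O1 until T1 has finished.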
-- 
Vincent Pelletier