*** ricardoamaro (~ricardoam@drupal.org/user/74228/view) has joined #wikid | 10:13 | |
*** ricardoamaro has quit (Ping timeout: 240 seconds) | 10:37 | |
*** ricardoamaro (~ricardoam@drupal.org/user/74228/view) has joined #wikid | 10:38 | |
*** ricardoamaro has quit (Ping timeout: 264 seconds) | 10:48 | |
*** ricardoamaro (~ricardoam@drupal.org/user/74228/view) has joined #wikid | 10:49 | |
*** ricardoamaro has quit (Remote host closed the connection) | 11:03 | |
*** ricardoamaro (~ricardoam@drupal.org/user/74228/view) has joined #wikid | 11:03 | |
*** joevano has quit (Ping timeout: 264 seconds) | 14:05 | |
*** joevano (~joevano@bzflag/developer/JoeVano) has joined #wikid | 14:05 | |
*** nowen (~nowen@99-174-92-191.lightspeed.tukrga.sbcglobal.net) has joined #wikid | 14:28 | |
laszlof | https://plus.google.com/u/0/107271417393296791459/posts/P4DkYp9xbu9?pid=6111240239850136322&oid=107271417393296791459 | 14:31 |
---|---|---|
laszlof | got a little snow | 14:31 |
nowen | woah | 15:11 |
laszlof | ya.. 16" in 24 hours | 15:13 |
laszlof | 3rd biggest snow storm in history for this area | 15:13 |
laszlof | office is empty today | 15:14 |
laszlof | haha | 15:14 |
nowen | do you have 4wd? | 15:32 |
laszlof | no, I have a car. | 15:32 |
laszlof | a small car | 15:32 |
laszlof | driving to the office was fun | 15:32 |
laszlof | :) | 15:32 |
*** ricardoamaro has quit (Quit: Leaving.) | 17:57 | |
*** AccentureDan (3f7c1664@gateway/web/freenode/ip.63.124.22.100) has joined #wikid | 18:48 | |
AccentureDan | hey nick | 18:48 |
AccentureDan | quick question | 18:48 |
nowen | hey AccentureDan | 18:48 |
nowen | ok | 18:48 |
AccentureDan | seeing some weird errors in WiKID | 18:48 |
AccentureDan | 2015-02-02 10:40:16.869 ERROR com.wikidsystems.client.wClient ERROR: java.io.IOException: failed to decrypt safe contents entry: javax.crypto.BadPaddingException: Given final block not properly padded | 18:48 |
AccentureDan | thoughts? | 18:49 |
nowen | what changed? | 18:49 |
AccentureDan | well, after we finished with our auto-failover scripting...the WiKID servers remain up and in an active-passive pair...but we have been holding sessions to make sure users can request OTPs from WiKID | 18:50 |
AccentureDan | with a bunch of users trying to request OTPs around the same time | 18:50 |
AccentureDan | the WiKID service becomes hung | 18:50 |
AccentureDan | i have to restart the services on the WiKID master server to get it to produce OTPs again | 18:51 |
nowen | what do you mean by 'holding sessions'? | 18:51 |
AccentureDan | com.mchange.v2.resourcepool.BasicResourcePool@736d1c2f -- an attempt to checkout a resource was interrupted, and the pool is still live: some other thread must have either interrupted the Thread attempting checkout! | 18:51 |
AccentureDan | getting this as well | 18:51 |
nowen | what version of WiKID? | 18:51 |
AccentureDan | holding sessions where we get 20-30 of the people who are registered to retry requesting one time passcodes | 18:51 |
AccentureDan | just in case they forgot their PIN or password for the token client | 18:51 |
nowen | ok | 18:51 |
AccentureDan | 4.0 build 0-b1803 | 18:52 |
AccentureDan | i know we need to upgrade haha | 18:52 |
AccentureDan | have you seen these issues before? | 18:52 |
nowen | it is most likely the memory leak we fixed in b1817 | 18:53 |
AccentureDan | gotcha...so think that is related to the hanging we are seeing? | 18:53 |
nowen | can you run 'locate java.security' and post or email the one that is not in /opt/WiKID? | 18:53 |
AccentureDan | sure | 18:53 |
AccentureDan | [root@pdlptoinf04 WiKID]# locate java.security /opt/WiKID/WiKIDbackup_111714/conf/templates/java.security /opt/WiKID/conf/templates/java.security /opt/WiKIDbackup_111714/conf/templates/java.security /opt/WiKIDbackup_12162014/conf/templates/java.security /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/security/java.security.rpmsave /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.34.x86_64/jre/lib/security/java.security /usr/lib/jvm/j | 18:53 |
AccentureDan | sorry about that | 18:54 |
AccentureDan | that is what I am seeing | 18:54 |
AccentureDan | when i run that | 18:54 |
AccentureDan | on the master WiKID server | 18:54 |
nowen | ok, what is in /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.34.x86_64/jre/lib/security/java.security? | 18:54 |
nowen | or run 'diff /opt/WiKID/conf/templates/java.security /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.34.x86_64/jre/lib/security/java.security' | 18:55 |
nowen | is there a difference in the files? | 18:55 |
AccentureDan | lemme email those commands | 18:56 |
AccentureDan | outputs* | 18:56 |
AccentureDan | okay sent | 18:58 |
nowen | the pool error and not being able to get OTPs should be fixed in the upgrade | 18:58 |
AccentureDan | this was what i got when i did the diff | 18:58 |
AccentureDan | gotcha so probably seeing an issue in this version. not related to the changes we made, which is what i thought | 18:58 |
nowen | how much RAM in each server? | 18:58 |
AccentureDan | brb one sec | 18:58 |
AccentureDan | back | 19:03 |
AccentureDan | 65 GB | 19:04 |
AccentureDan | per machine | 19:04 |
AccentureDan | dont ask LOL | 19:04 |
nowen | of RAM? | 19:04 |
AccentureDan | yep | 19:04 |
laszlof | christ | 19:04 |
AccentureDan | i know LOL | 19:04 |
AccentureDan | they were repurposed Oracle X3-2's | 19:04 |
laszlof | overkill much? | 19:05 |
AccentureDan | beyond overkill | 19:05 |
AccentureDan | dont even ask about available storage LOL | 19:05 |
laszlof | 24PB? | 19:05 |
laszlof | ;) | 19:05 |
AccentureDan | LMAO | 19:05 |
nowen | ok - run 'cp /opt/WiKID/conf/templates/java.security /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.34.x86_64/jre/lib/security/java.security' | 19:05 |
AccentureDan | 1 TB, close enough :) | 19:05 |
AccentureDan | will do | 19:05 |
nowen | and 'wget http://wikidsystems-dl.com/wikid-server-enterprise-4.0.1.b1821-1.noarch.rpm' | 19:05 |
nowen | and 'rpm -Uvh wikid-server-enterprise-4.0.1.b1821-1.noarch.rpm' | 19:06 |
nowen | on both servers and restart | 19:06 |
nowen | seems like it would take a long time for a memory leak to get noticed | 19:06 |
AccentureDan | okay downloading | 19:07 |
AccentureDan | will have to update either tonight or tomorrow after these trainings are done...should be fun...you should see the amount of work we put in to making sure these master and slave servers auto-failover | 19:13 |
AccentureDan | its disgusting | 19:13 |
AccentureDan | obviously customized for our environment and contractual obligations | 19:13 |
nowen | no doubt | 19:14 |
AccentureDan | as promised i will annotate our changes and forward on what we did...we created a solution for failing over, utilizing crontab, and made some modifications to the code itself...including using a master script...all running from a secondary server with a separate instance of WiKID | 19:16 |
nowen | cool | 19:16 |
*** AccentureDan has quit (Ping timeout: 246 seconds) | 19:32 | |
*** nowen has quit (Quit: Leaving.) | 22:23 |
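
The fix nowen outlines between 19:05 and 19:06 (copy the WiKID `java.security` template over the JRE's copy, download the b1821 RPM, upgrade) could be scripted roughly as below. This is only a sketch: the paths, URL, and package name are copied from the log, while the `DRY_RUN` switch and the `run` and `upgrade_steps` helpers are hypothetical names introduced here. The restart command is not given in the log, so it is left as a comment.

```shell
#!/bin/sh
# Sketch of the steps from the log, to be run on each server.
# DRY_RUN is a hypothetical switch (not part of WiKID); it defaults to 1,
# in which case commands are only printed, never executed.
DRY_RUN="${DRY_RUN:-1}"

# Paths, URL, and package name as quoted in the log.
TEMPLATE=/opt/WiKID/conf/templates/java.security
JRE_FILE=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.34.x86_64/jre/lib/security/java.security
RPM=wikid-server-enterprise-4.0.1.b1821-1.noarch.rpm
RPM_URL="http://wikidsystems-dl.com/$RPM"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

upgrade_steps() {
    run cp "$TEMPLATE" "$JRE_FILE"   # replace the JRE's java.security with the WiKID template
    run wget "$RPM_URL"              # fetch the b1821 package
    run rpm -Uvh "$RPM"              # upgrade the installed server package
    # Restart the WiKID services afterwards (the log does not give the command).
}

upgrade_steps
```

With `DRY_RUN` left at its default of 1 the script only prints the three commands. Setting `DRY_RUN=0` would execute them, which per nowen needs to happen on both servers before the restart.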