1.13.2019

The saga of a conference abstract / Abstract submission for a conference in the midst of a government shutdown

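"Let's present at the conference in May!" was the goal behind this DNA analysis, and thanks to help from all sorts of people I somehow got to the point of writing the abstract. I'm moved.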

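The plants went into the ground last March. I collected leaves, extracted DNA, prepared the samples, and sent them out for sequencing, and they came back as 20 GB × 4 of data. As delivered, that is nothing more than several million DNA fragments, so I ran them through various pieces of software, at the mercy of UNIX the whole time, to find out how many single nucleotide polymorphisms there are and where they are. From that information you can tell how closely related these lines are genetically, and which genetic variants matter for expressing the traits we need.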

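Processing that DNA was extremely rough going. I wasn't used to driving a supercomputer from the command line (the black screen) on UNIX, and steps like sifting out only the usable fragments or aligning them to the known common bean genome sequence sound simple when written down, but I stumbled every single time. This happened even though I had been given code that was supposed to be all I needed. The causes were things like not having loaded the required software, the wrong Java version, or the wrong directory, but it took a long time before I could work out for myself what was wrong. I truly owe a lot of people.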

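With 300 samples, every single step takes time. The supercomputer works day and night, and even so, just aligning the reads to the genome takes something like 8 hours. I started with a little over a week to go before the winter break, and for the first two days or so nothing went the way I wanted. I kept thinking, "Maybe I should give this up. It's terribly inefficient. Wouldn't it be far better to spend this time on something else? There are other people who actually know bioinformatics..." But I kept struggling on anyway. Little by little I learned how to deal with the errors, and I got to the point where I could probably do the rest on my own. It was a week with an enormous amount of learning.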

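What sets bioinformatics apart is that the beginning is the hardest part. In other subjects or languages you start with easy things, your knowledge grows exponentially at first, and then the growth gradually levels off. With this kind of data processing, you have no idea what is what at the start, and the learning curve hovers around zero. As things begin to make sense the curve slowly turns upward, but very slowly. Patience matters.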

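I wrote the abstract, had my advisor look at it at the last minute, and just when I thought "Great, I can submit!", the government shutdown hit. We can't spend money, so I couldn't register for the conference, and without registering I couldn't submit the abstract. Using my own credit card was the last resort because it causes trouble later. (Apparently at reimbursement time you get asked, "Why did you do something that would use budget during the shutdown?" I want to shoot back: how is anyone supposed to do proper work when this no-money limbo has lasted more than 20 days?!) I asked a technician who works for a different professor, but the transaction could not go through because the fee was in a currency other than US dollars (a university policy, apparently). So I explained the US government situation to the conference organisers and asked whether I could pay later. To my surprise, they readily made a special arrangement for me. I'm so grateful. With that, registration and abstract submission were safely done, and it looks like I can go to the conference. That was 6 hours before the abstract deadline. Next time I want more of a margin.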

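Postscript.
The day after the deadline, I looked at the website and the abstract submission deadline had been extended to 2/1 (!). After all that fretting about having only 6 hours left. Oh well. I can push the analysis further and add more results to the abstract.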

I was finally able to write an abstract for a poster presentation at the International legume conference in May! The registration for abstract submission was due on 1/11, and I spent the whole winter break trying to analyse the DNA sequencing data of my population. It was intense, and I almost thought I wouldn't make it because it was too challenging - but I did!!

The DNA sequencing data analysis starts with QC'ing the reads - are your DNA reads any good? Only good-quality reads should be used. This step was relatively easy because I had done it in a class assignment before. But a system upgrade of the supercomputer two months ago had changed the job-queueing system, so I had to rewrite some of my scripts to meet the new job scheduler's requirements. This wasn't too bad. Then the reads are assigned to each sample, that is, to each variety that was sequenced. Because the DNA samples were sequenced in bulk, the reads needed to be sorted according to which genotype they belonged to. It was tough to get accustomed to new software. The person who did the DNA prep with me has long experience in bioinformatics, and he said "Download this java file and run this script, you won't have to do anything else". But I had to do many other things to make this work - such as building the software platform and loading the correct Java! To be fair, he did not remember that because he had built the necessary platform a long time ago. Once you get used to something, it's hard to know what beginners want to know.
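
For readers who have not seen these two steps, here is a toy Python sketch of the idea: drop low-quality reads, then sort the rest by sample. It is only an illustration and not the software used for this project. It assumes, purely for the example, that each read starts with a short sample barcode and that quality strings use the Phred+33 encoding; the barcodes, sample names, quality cutoff, and function names are all made up.

```python
# Toy illustration only. Barcodes, sample names, the quality cutoff, and the
# function names are hypothetical; a real pipeline would use dedicated tools.

def mean_phred(quality_string, offset=33):
    """Return the mean Phred score of a quality string (Phred+33 assumed)."""
    if not quality_string:
        return 0.0
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def demultiplex(reads, barcode_to_sample, min_mean_quality=20):
    """Drop low-quality reads, then group the rest by leading barcode.

    reads: iterable of (sequence, quality_string) pairs.
    barcode_to_sample: dict mapping a barcode prefix to a sample name.
    Returns (reads_by_sample, unassigned_sequences).
    """
    reads_by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for sequence, quality in reads:
        if mean_phred(quality) < min_mean_quality:
            continue  # QC step: skip reads whose mean quality is too low
        for barcode, sample in barcode_to_sample.items():
            if sequence.startswith(barcode):
                # Keep the read without its barcode prefix
                reads_by_sample[sample].append(sequence[len(barcode):])
                break
        else:
            # No barcode matched, so the read cannot be assigned to a sample
            unassigned.append(sequence)
    return reads_by_sample, unassigned


if __name__ == "__main__":
    example_reads = [
        ("ACGTTTGA", "IIIIIIII"),  # good quality, barcode ACGT
        ("TGCAGGCC", "IIIIIIII"),  # good quality, barcode TGCA
        ("ACGTAAAA", "!!!!!!!!"),  # poor quality, dropped by QC
        ("GGGGGGGG", "IIIIIIII"),  # good quality, no known barcode
    ]
    example_barcodes = {"ACGT": "line_A", "TGCA": "line_B"}
    by_sample, leftover = demultiplex(example_reads, example_barcodes)
    print(by_sample)
    print(leftover)
```

Real data adds complications this sketch ignores, such as sequencing errors inside the barcode and files far too large to hold in memory, which is part of why dedicated software is used for this step.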

The most difficult thing is that there is little helpful info online. I just had to ask around with very specific questions. It's really hard when you don't know what is wrong or what should be done. But this deconvolution step taught me a lot - I was moving towards the stage where I understand the commands and can troubleshoot, rather than just running whatever I was told to run without really knowing what I was doing. The first step was the most difficult, then things got easier. The reads were aligned to the reference genome, variants were called, lines were genotyped, and the final set of variants was obtained!! That was an intense week with tons of learning. I am so thankful to everyone who helped me along the way!

With 300 samples, the computation is heavy at each step. The university's supercomputer works day and night, and it's not rare for one operation to take more than 8 hours. At one point, the SNP discovery step felt too long and too much for me. Considering how little progress I had made in the previous 2 days, I thought about giving up and doing other things. Wouldn't it be more efficient if I did things I can do well and let the other person do the bioinformatics, as he is more skilled...? Anyway I continued, stumbling on everything, but I'm glad I did. I learned to be persistent and patient!! Getting used to new tools quickly is an important skill. Tools evolve so fast that we need to change our methods often. I want to be able to grasp the concept and capabilities of a piece of software, and to learn how to operate it quickly.

After all that toil and moil, I hit another wall - the government shutdown. Government employees are on furlough (temporary unpaid leave) and are not supposed to work or spend money. But the registration deadline was on Friday! We asked a technician at one of the university's labs to pay for us, but the purchase card was declined because the registration fee was in euros, not US dollars. How do people register for an international conference with such a tight security system?? Using my own credit card was the last resort because it would take forever to get reimbursed. The reason is that we are not supposed to do anything that will incur government spending before the federal funding is restored, so they will ask "why did you pay for something during this period?" and make the reimbursement paperwork complicated and cumbersome. Well, how can you expect us to accomplish anything meaningful after paralysing our laboratory activity for more than 20 days??? The second-to-last resort was to ask the conference organiser to let me register now and pay later. I emailed them explaining the current catch-22 situation. Surprisingly, they were OK with me registering now without paying! Is that because they are French? They must be knowledgeable about politics and sympathetic about the ongoing chaos in the US. A salute to the country of the French Revolution and Montesquieu.

It was 6 hours before registration closed that I was able to register and submit my abstract. It was truly a last-minute submission. I'm glad I can go to the conference! Excuses aside, I will register well in advance next time.

Funnily enough, the abstract submission deadline was then extended to 2/1!!! What was all the rushing for??? I will polish my abstract and maybe add some more results by the end of this month.

