OpenHelix

199 posts · 151,684 views

On the OpenHelix blog you will find a genomics resources news portal with daily postings about genomics and bioinformatics resources, genomics news and research, science and more. Our goal is to keep you, the researcher, informed about the overwhelming amount of genomics data out there and how to access it through the tools, databases and resources that are publicly available to you.

Mary
86 posts

Trey
4 posts

Jennifer
35 posts

  • April 13, 2011
  • 09:16 AM
  • 1,014 views

Tip of the week: VirusMINT

by Jennifer in OpenHelix


In this week’s tip I’d like to introduce you to VirusMINT. We found VirusMINT during our ‘regularly scheduled’ update of our Introductory tutorial on MINT, or the Molecular INTeraction database. We really like MINT for all the great interaction information they provide on a wide variety of species. When we saw they had a ‘virally focused’ database, we had to check it out.
It turns out that VirusMINT is really unique in that it shows interactions BETWEEN human and viral proteins, all on the same interaction map. PLUS, from the VirusMINT homepage description:
VirusMINT uses the PSI-MI standard and is fully integrated with the MINT database.
You can either search for any viral or human protein by entering either common names or database identifiers in the form in the left frame or display a complete viral interactome by pressing the corresponding button in the frame below.

I only had time to show you the most basic VirusMINT features in this short movie. After you watch it, be sure to head over to MINT & check out all their great features. The site currently includes 4 sister databases: MINT, HomoMINT (an inferred human network), Domino (a domain peptide interactions database) and VirusMINT.  These four databases are a really nice protein interaction resource because each offers a clean set of information on an important area of protein interactions and all four are integrated with one another.  The MINT databases use PSI-MI standard formatting to capture curated protein interaction information from literature & direct user submissions.  Not only is data integrated across the four databases, each database also provides interactive viewers for visually displaying the data. You can also output and download the data.
For more details on their full functionality, consider checking out our full MINT tutorial – available through a subscription to our full database of training materials (currently on sale); purchasing individual access to the tutorial; or checking out the references listed below.
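If you do download interaction data, here is a minimal sketch of how you might pull out the human-virus pairs yourself. It assumes the download is in the tab-delimited PSI-MI TAB 2.5 layout (15 columns, with the organism of interactor A in column 10 and of interactor B in column 11); whether a given MINT download uses exactly that layout is my assumption, so check the file first. The function names and the example rows are made up for illustration.

```python
import re

HUMAN_TAXID = "9606"  # NCBI taxonomy ID for Homo sapiens


def taxid_of(field):
    """Pull the numeric ID out of an organism field like 'taxid:9606(Homo sapiens)'."""
    match = re.search(r"taxid:(-?\d+)", field)
    return match.group(1) if match else None


def human_virus_pairs(mitab_lines):
    """Yield (interactor A, interactor B) for rows with exactly one human partner.

    Assumption: the file holds only human and viral proteins, so the
    non-human partner is taken to be viral.
    """
    for line in mitab_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 11:
            continue  # not a complete row in the assumed layout
        tax_a, tax_b = taxid_of(cols[9]), taxid_of(cols[10])
        if tax_a is None or tax_b is None:
            continue
        if (tax_a == HUMAN_TAXID) != (tax_b == HUMAN_TAXID):
            yield cols[0], cols[1]


def make_row(id_a, id_b, tax_a, tax_b):
    """Build a 15-column example row; unused columns get the '-' placeholder."""
    cols = ["-"] * 15
    cols[0], cols[1], cols[9], cols[10] = id_a, id_b, tax_a, tax_b
    return "\t".join(cols)


# Made-up rows: identifiers and the viral taxid are placeholders, not real records.
example_rows = [
    make_row("uniprotkb:HUMAN_X", "uniprotkb:VIRAL_Y",
             "taxid:9606(Homo sapiens)", "taxid:99999(made-up virus)"),
    make_row("uniprotkb:HUMAN_X", "uniprotkb:HUMAN_Z",
             "taxid:9606(Homo sapiens)", "taxid:9606(Homo sapiens)"),
]
pairs = list(human_virus_pairs(example_rows))
print(pairs)
```

Only the first example row survives the filter, because the second one is a human-human interaction.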
References:

Chatr-aryamontri, A., Ceol, A., Peluso, D., Nardozza, A., Panni, S., Sacco, F., Tinti, M., Smolyar, A., Castagnoli, L., Vidal, M., Cusick, M., & Cesareni, G. (2009). VirusMINT: a viral protein interaction database Nucleic Acids Research, 37 (Database) DOI: 10.1093/nar/gkn739


Chatr-aryamontri, A., Ceol, A., Palazzi, L., Nardelli, G., Schneider, M., Castagnoli, L., & Cesareni, G. (2007). MINT: the Molecular INTeraction database Nucleic Acids Research, 35 (Database) DOI: 10.1093/nar/gkl950


Ceol, A., Chatr Aryamontri, A., Licata, L., Peluso, D., Briganti, L., Perfetto, L., Castagnoli, L., & Cesareni, G. (2009). MINT, the molecular interaction database: 2009 update Nucleic Acids Research, 38 (Database) DOI: 10.1093/nar/gkp983






  • March 30, 2011
  • 01:07 AM
  • 1,529 views

Tip of the Week: MetaPhOrs, orthology and paralogy predictions

by Trey in OpenHelix

The researchers and developers at PhylomeDB haven’t rested on their laurels. I did a tip of the week on PhylomeDB 3 months ago and not too long ago I was checking over there and found the team had created another useful database and analysis tool, MetaPhOrs. What is MetaPhOrs? To quote from the homepage:
MetaPhOrs is a public repository of phylogeny-based orthology and paralogy predictions that were computed using resources available in seven popular homology prediction services (PhylomeDB, EnsemblCompara, EggNOG, OrthoMCL, COG, Fungal Orthogroups, and TreeFam).
The research article on their methodology published in NAR (online 12/10) will give you a better understanding of how these orthology and paralogy predictions are made. Basically, MetaPhOrs uses phylogenetic orthology and paralogy predictions from several sources. These phylogenies overlap:
Since many of these repositories overlap, partially, in terms of genomes covered, it is often the case that phylogenetic information regarding a pair of proteins can be found in several databases.
Moreover, these phylogenies are built with different protein sets, parameters and methodologies.
Such level of information redundancy can be exploited to assess the robustness of a given orthology or paralogy prediction to changes in the phylogenetic settings…. Intuitively, a prediction that is not affected by such settings will be considered more reliable.
MetaPhOrs uses this information to predict orthologs and paralogs for protein pairs with a consistency score (CS, “the fraction of trees predicting an orthology relationship over the total of trees considered”) and an evidence level (EL, “how many independent sources have been used for the prediction”). CS for orthologs ranges from 0 (all trees predict paralogy) to 1 (all trees predict orthology). Take a look at the paper for more information on this methodology and results.
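To make those two numbers concrete, here is a small sketch that computes them directly from the quoted definitions. It is not the MetaPhOrs code; the function names, the input shape (one source/call pair per tree) and the example calls are my own assumptions, and the paper may handle details this toy version ignores.

```python
def consistency_score(tree_calls):
    """Fraction of trees that call the protein pair orthologous.

    tree_calls is a list of (source, call) tuples, one per tree,
    where call is either "orthology" or "paralogy".
    """
    if not tree_calls:
        raise ValueError("need at least one tree to score a pair")
    orthology_trees = sum(1 for _, call in tree_calls if call == "orthology")
    return orthology_trees / len(tree_calls)


def evidence_level(tree_calls):
    """Number of independent sources that contributed at least one tree."""
    return len({source for source, _ in tree_calls})


# Made-up calls for one hypothetical protein pair.
calls = [
    ("PhylomeDB", "orthology"),
    ("PhylomeDB", "orthology"),
    ("TreeFam", "orthology"),
    ("EnsemblCompara", "paralogy"),
]
print(consistency_score(calls), evidence_level(calls))
```

Three of the four trees predict orthology, so CS is 0.75, and the trees come from three different sources, so EL is 3.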
To date, the database uses over 700,000 phylogenies from several sources to predict over 300 million homologous protein pairs from over 800 fully sequenced genomes. They plan to regularly update and add more phylogenetic and protein data.
Today’s tip spends 5 minutes going over the database and showing you how to access these predictions.
Pryszcz, L., Huerta-Cepas, J., & Gabaldon, T. (2010). MetaPhOrs: orthology and paralogy predictions from multiple phylogenetic evidence using a consistency-based confidence score Nucleic Acids Research, 39 (5) DOI: 10.1093/nar/gkq953





  • March 28, 2011
  • 09:15 AM
  • 1,491 views

Protein Structure Analysis – How Far We’ve Come!

by Jennifer in OpenHelix

The team here at OpenHelix has recently updated our sponsored tutorials on two excellent structural biology resources, the RCSB Protein Data Bank (PDB) and the PSI-Nature Structural Biology Knowledgebase (PSI SBKB). Because the tutorials are sponsored by these resources they are free for anyone to view and download in full. You can access our training materials for the resources at our RCSB PDB landing page, or our PSI SBKB landing page. I’m very happy with both tutorial suites, so please check them out.
As my personal celebration for these releases I have been reading a variety of articles showing the scope of how far our abilities to analyze protein structures have come. The first article is one that Mary pointed me to a while back, which discusses the infancy of bioinformatics, entitled “The Roots of Bioinformatics in Protein Evolution” by RF Doolittle (cited below, as are all articles mentioned). In this wonderful perspective Dr. Doolittle describes a time when DNA sequencing was unimaginable and protein sequencing was laborious, slow, and yet so new that each day was full of excitement as one more amino acid was identified. It is a revealing glimpse at a research era gone by – to quote Doolittle, “Science as an endeavor thrives on obsolescence.” – and mentions the contributions of Margaret Dayhoff, who Mary has blogged about.
The next historical article that I read was entitled “The Early Years of Retroviral Protease Crystal Structures” by M Miller (freely available on PMC). As you can tell from the title, this covers a time more recent than the Doolittle article, when protein crystallization studies were possible. Dr. Miller traces the X-ray crystal studies of retroviral proteases at the NCI-Frederick in the late 1980s and early 1990s, and she describes how chemical synthesis of HIV1-PR was critical to obtaining enough protein for crystallization and how the crystal structure of it (deposited into the PDB archive and therefore freely available for all researchers to study) was invaluable for the design of inhibitors of HIV1-PR as anti-AIDS drugs.
I’ve also been perusing more recent papers that highlight how protein structures can aid biological investigations. These include: “Structure of mammalian AMPK and its regulation by ADP”,  “Bioinformatics analysis of disordered proteins in prokaryotes”, “Crystal structure of inhibitor of κB kinase β” and others. It would also be fun to attend “The 25th Annual Meeting of the Groups Studying the Structures of AIDS-Related Systems and Their Application to Targeted Drug Design” to learn more, but alas I will not be in the area at the time of the meeting. As I’ve posted before, I am a geneticist by education. To me, seeing the development of protein studies (through the historical reviews) and the studies currently occurring in the field of structural biology, combined with the amazing offerings available freely through both the RCSB PDB and the PSI SBKB, really does feel like an appropriate, and enjoyable, celebration for the completion of our tutorial updates. Let me know what you think about them, when you get a chance!
References:

Berman, H. (2000). The Protein Data Bank Nucleic Acids Research, 28 (1), 235-242 DOI: 10.1093/nar/28.1.235


Berman, H., Westbrook, J., Gabanyi, M., Tao, W., Shah, R., Kouranov, A., Schwede, T., Arnold, K., Kiefer, F., Bordoli, L., Kopp, J., Podvinec, M., Adams, P., Carter, L., Minor, W., Nair, R., & Baer, J. (2009). The protein structure initiative structural genomics knowledgebase Nucleic Acids Research, 37 (Database) DOI: 10.1093/nar/gkn790


Doolittle, R. (2010). The Roots of Bioinformatics in Protein Evolution PLoS Computational Biology, 6 (7) DOI: 10.1371/journal.pcbi.1000875


Miller, M. (2010). The early years of retroviral protease crystal structures Biopolymers, 94 (4), 521-529 DOI: 10.1002/bip.21387






  • March 23, 2011
  • 09:08 AM
  • 1,237 views

Tip of the week: ORegAnno for regulatory annotation

by Mary in OpenHelix


Lately we’re getting a lot of questions about ways to analyze the promoters and other regulatory aspects of genes. And for a while we were mostly pointing to the prediction data that was available in the UCSC Genome Browser’s TFBS Conserved track. TFBS Conserved is a track of computationally predicted transcription factor binding sites (TFBS) which are conserved across human/mouse/rat and based on Transfac v7.0 by BioBase.  As they say in the track description, it’s important to know this:
The data are purely computational, and as such not all binding sites listed here are biologically functional binding sites.
Though this is useful, people have been wanting more evidence based on real binding and/or activity data. Today’s tip will talk about 2 ways to get other data–beyond computational predictions. First we’ll explore ORegAnno so you’ll understand the data sources, and then we’ll also look at that data in the context of the UCSC Genome Browser and some useful data from the ENCODE project.
ORegAnno is the Open Regulatory Annotation Database, a community literature curation project for regulatory information. Anyone can participate in the curation–they provide helpful curation tools and automated cross-linking and checking features that make it easier. You would register, curate, and the data becomes available to anyone. And with the curator tools that are available the data becomes loaded into projects that coordinate with ORegAnno–including the track at the UCSC Genome Browser of ORegAnno data.
In the paper published in NAR 2008, they stated this:
The current release comprises 30 145 records curated from 922 publications and describing regulatory sequences for over 3853 genes and 465 transcription factors from 19 species.
So that’s a nice set with traceable data that’s not just computational predictions. In the tip I’ll show one example of Stat1 binding, in human, near the Il10 gene. If you look at that record, you’ll see several pieces of evidence that support this data and a link to the publication that offers it.
Now, if you look at ORegAnno data over in the UCSC Genome Browser, you could compare it to the computational predictions, or TFBS data from other projects such as the ENCODE data sets with the ChIP-seq data (Yale TFBS and HAIB, for example; note: you may have to go back an assembly because the ENCODE data is not all on the current assembly at this time). This is what I show in the movie: I take an ORegAnno annotated item, visualize that with the TFBS Conserved predictions and with some ENCODE project data.  So you get all 3 types of data with a few clicks.
So there are several ways to look for TFBS data–some of it computational predictions, some literature curation, and some big data stuff from the ENCODE teams. All of them have strengths and caveats. Computational predictions may be genome wide and independent of a given cell or tissue type, but are subject to the constraints of the algorithms. Community literature curation can offer quality evidence, but may be selected by interested groups and not as broadly representative of the genome-wide situation. Big data projects can be genome-wide and have evidence in some cell types, but may be in progress and subject to checking as they are pre-publication data.  But effectively using them all could help you to understand regulation of genes that you might be interested in.
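If you export features from those tracks, the same three-way comparison can be sketched in a few lines. This is only an illustration of the idea of checking one curated site against other evidence types: the coordinates and track contents below are invented, the function names are mine, and intervals are assumed to be BED-style (0-based start, end not included).

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def support_for(curated_site, other_tracks):
    """Names of the tracks that have at least one feature overlapping the curated site."""
    return [
        name
        for name, features in other_tracks.items()
        if any(overlaps(curated_site, feature) for feature in features)
    ]


# Invented coordinates, for illustration only.
curated_site = ("chr1", 1000, 1020)  # stands in for one literature-curated record
other_tracks = {
    "computational prediction": [("chr1", 990, 1005), ("chr1", 5000, 5012)],
    "ChIP-seq peak": [("chr1", 2000, 2300)],
}
supported_by = support_for(curated_site, other_tracks)
print(supported_by)
```

In this made-up case the curated site is backed by a prediction but not by a ChIP-seq peak, which is exactly the kind of disagreement the caveats above would help you interpret.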
Quick Links:
ORegAnno: http://www.oreganno.org/
Biobase and Transfac: http://www.gene-regulation.com/pub/databases.html
UCSC Genome Browser: http://genome.ucsc.edu/
ENCODE data at UCSC: http://genome.ucsc.edu/ENCODE/
Reference:
Griffith, O., Montgomery, S., Bernier, B., Chu, B., Kasaian, K., Aerts, S., Mahony, S., Sleumer, M., Bilenky, M., Haeussler, M., Griffith, M., Gallo, S., Giardine, B., Hooghe, B., Van Loo, P., Blanco, E., Ticoll, A., Lithwick, S., Portales-Casamar, E., Donaldson, I., Robertson, G., Wadelius, C., De Bleser, P., Vlieghe, D., Halfon, M., Wasserman, W., Hardison, R., Bergman, C., Jones, S., & The Open Regulatory Annotation Consortium. (2007). ORegAnno: an open-access community-driven resource for regulatory annotation Nucleic Acids Research, 36 (Database) DOI: 10.1093/nar/gkm967





  • March 18, 2011
  • 10:15 AM
  • 1,204 views

“On the organization of #bioinformatics core services…”

by Mary in OpenHelix

Paul Blaser tweets this:
On the organization of #bioinformatics core services in biology-based research institutes – http://goo.gl/zFjjD
It’s an Advanced Access editorial item in Bioinformatics, and it describes some key features of bioinformatics support service organizations. And can I just say, AMEN!
As we do workshops around the country, we see a wide–WIDE–range of what bioinformatics support means. It can range from one person in the library to a large “big data” group that is also supposed to support local researchers somehow, while doing their own research. And don’t even get me started on the big name institutions that have really left their staff adrift with essentially no support. Sigh. And the range of needs is huge–from microscopy imaging to handling next-gen data and everything in between. The species range is huge–and sometimes meta. But if we could get the average bench biologists up to speed on some tools they could solve a lot of their own issues. And then be more effective users of advanced support with more complex questions that go further.
Anyway: this is a nice piece that people should use as support to create a bioinformatics core. The ideas look quite sound to me. And my favorite parts were:
6. One of the key missions of the bioinformatics is to provide training to biologists at a basic level….
[and]
7. It is useful to nominate a bioinformatics support person, whose task is to guide users in the use of public bioinformatics tools and databases as well as tools and methods developed within the institute.
And we can help with this, you know. You don’t have to create all these materials yourself. There is no reason that everyone around the world needs to re-create Introduction to the UCSC Genome Browser. You could roll out a weekly seminar on a tool starting tomorrow using our slides. Save your energy for the custom questions and the local issues.  Or you can learn from the Khan Academy experience that he recently described in his TED talk–people like the movies because they can learn on their own time:  Salman Khan: Let’s use video to reinvent education. And with this model you don’t spend time doing the lecturing. They’ve seen the lecture, you spend the time on the exercises (which we give you, and you can supplement with local issues) and their questions….

The “embedded” person described in the editorial is a cool position for someone too. I’ve been in that role, and I really liked being the bridge between the biology and the computational side. And here’s a pro-tip: people don’t like to feel like they are imposing on your time if you aren’t a designated support person. And they can tell when you are talking down to them, but they often need more basic level info than some people want to provide. Also: there’s a big difference between giving them theory (which is fine), and actually using the software tools. They’ve told us that is how their local folks sometimes make them feel.  It’s funny to come into the situations as outsiders, actually–people tell us stuff they don’t tell their colleagues.
Kallioniemi, O., Wessels, L., & Valencia, A. (2011). On the organization of bioinformatics core services in biology-based research institutes. Bioinformatics DOI: 10.1093/bioinformatics/btr125





  • March 15, 2011
  • 02:55 PM
  • 1,275 views

My Allergy Gene: Cool. And Wut?

by Mary in OpenHelix

Wherein I #FAIL personal genomics….
One of the medical issues of personal interest to me is allergy. It’s something I pay attention to all the time because I have an allergy that can really cause me problems–peanuts. But I come from a family with a whole host of additional allergy issues too. All of my siblings have had serious problems with eczema. Their cracked and raw winter hands can be significantly painful, and I feel for them. Sometimes I felt like the peanut allergy was a better bargain–not having a Reese’s peanut butter cup is just not that big a deal, ya know? Luckily I haven’t been fooled by any peanut flour that was able to kill me–so far.
So when I read the recent article in The Scientist on The Allergy Gene I was totally engrossed. It’s not your typical research article–it’s the story of the work and the outcome in a narrative way. It’s about how this group went after an incredibly challenging gene–filaggrin (FLG). It is a cytoskeletal protein–an intermediate filament class one. As my PhD project was on cytoskeleton I’ve also had a professional interest in these too.  It’s also a massive mRNA, making a “monster” pre-protein–making it rather unusual. And nearly the whole thing is coded by a single exon. That’s even stranger. There are also some really heinous repetitive stretches–a sequencing nightmare. So from a genomics/sequencing/gene structure perspective–wow! That’s a cool story.
But it makes sense to me that if the family epidermis or other surfaces are a bit more porous than some other people’s, that could explain our contact allergy issues.
So FLG is compelling, curious, and intriguing to me from a number of perspectives. I was planning to write this up anyway because I thought people I know would be interested in the allergy gene–when just last week I saw another fascinating piece come along, from the BBC of all places:  Scientists claim peanut allergy ‘gene flaw’ link. Guess which protein it is? FLG. Bam!
Like most gene-link stories, this will only be part of the explanation. But it looks like it could represent a significant number of the cases of peanut allergy with a filaggrin defect.
I began to wonder to myself: what does my filaggrin gene look like? I could look it up using the 23andme data. But what SNPs will that have? If this is a nasty and repetitive region, how good are these SNPs? And what are the key pieces of FLG that I need to know about?
So I turned to the 23andme data set, looking for FLG. I get a list of SNPs. Some of them have dbSNP ids. Some of them don’t. Ok… [PS: I'm not disclosing my personal SNP data here as I don't think the security and protection of my information is sufficient; I have disclosed my peanut allergy because it is already part of my medical record anyway and can't hurt me any more. ]
Now what? I have to go look at the positions specifically in the UCSC Genome Browser to get a better handle on this, especially for the non-dbSNP ones. Ok–I write up a quick custom track with my data (which I’m not showing you).
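For anyone who wants to try the same thing with their own file, here is a rough sketch of how such a custom track could be written. It assumes the raw download is tab-separated with rsid, chromosome, position and genotype columns and `#` comment lines, and that positions are 1-based; the function name, the track name and the example rows are all made up (none of this is my data), and you would need to supply region coordinates for the assembly you are browsing.

```python
def raw_to_custom_track(raw_lines, chrom, start, end, track_name="my_snps"):
    """Turn raw genotype rows inside one region into UCSC BED custom-track text.

    chrom is the bare chromosome label used in the raw file (for example "1");
    start and end are 1-based, inclusive positions on the same assembly
    as the raw file.
    """
    out = ['track name="%s" description="Genotyped positions in region" visibility=pack'
           % track_name]
    for line in raw_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip rows that do not match the assumed layout
        rsid, row_chrom, pos, genotype = fields[:4]
        pos = int(pos)
        if row_chrom == chrom and start <= pos <= end:
            # BED is 0-based and half-open: a SNP at 1-based position p is [p-1, p)
            out.append("chr%s\t%d\t%d\t%s_%s" % (row_chrom, pos - 1, pos, rsid, genotype))
    return "\n".join(out) + "\n"


# Made-up rows and a made-up region, for illustration only.
example_raw = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t1\t1500\tAG",
    "i0000002\t1\t1600\tCC",
    "rs0000003\t2\t1550\tTT",
]
track_text = raw_to_custom_track(example_raw, chrom="1", start=1000, end=2000)
print(track_text)
```

The resulting text can be pasted into the browser's custom track box. Labeling each item with the identifier plus genotype makes the entries without dbSNP ids easy to spot next to the SNP track.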
Whoa–look at that odd FLG gene region in the UCSC Genome Browser (load up my UCSC Genome Browser session showing this region by clicking here). There’s that massive exon, and check out the conservation track. That is one of the strangest looking gene situations I’ve seen–and I’ve seen a lot of genes.  Bizarre.
It’s loaded with SNPs–many coding non-synonymous changes. If you open the SNPs 131 to “pack” mode, you’ll find one of the biggest SNPs I’ve ever seen. Look at the red box for rs71770072. It’s a “large deletion” of almost 1kb (but PS: I have seen larger).  Some of the SNPs match the 23andme data. Most don’t. There are just a couple of handfuls of 23andme data.
But fine–I’m looking around now for other tidbits of information. Let’s see what we can learn about the mutations from the FLG paper. I check out the peanut allergy paper, which is fine. But they refer to previously characterized mutations, so I have to hunt back some more. On my hunting expedition some papers are behind pricey paywalls, so I don’t go for those. But I found one paper in The Journal of Investigative Dermatology that had a summary diagram of the mutations. Psyched!
I check out the figure with the mutations illustrated. Ah, damn. They are all cryptic names–protein position related, not handy dbSNP ids. No idea what assembly/version of this protein sequence. Sigh. So I start tracking them down. I’m checking OMIM. EntrezGene. Leiden Open Variation Database. HGVS.  Psyche! Nothing really useful here. Sigh.

I’m coming up with nothing that helps me to understand the relationship of the new data, the  old data, and my personal genomics data. And I know what to look for. It’s been a pretty frustrating few days. And I have a flashlight. Most people who open up their personal genomics data right now are in the dark, without a flashlight. We’re so not ready.
If anyone else takes a look at their FLG and figures out what mutations match the clinical data, let me know. I’m going to keep looking, but I have other stuff I need to do. I don’t have time for this, for one gene, for now. Sigh.
I #FAIL personal genomics. I have no idea what my filaggrin gene alleles correspond to.  That’s embarrassing.
Scientific References:
Sandilands, A., Smith, F., Irvine, A., & McLean, W. (2007). Filaggrin’s Fuller Figure: A Glimpse into the Genetic Architecture of Atopic Dermatitis Journal of Investigative Dermatology, 127 (6), 1282-1284 DOI: 10.1038/sj.jid.5700876
Brown, S., Asai, Y., Cordell, H., Campbell, L., Zhao, Y., Liao, H., Northstone, K., Henderson, J., Alizadehfar, R., & Ben-Shoshan, M. (2011). Loss-of-function variants in the filaggrin gene are a significant risk factor for peanut allergy Journal of Allergy and Clinical Immunology, 127 (3), 661-667 DOI: 10.1016/j.jaci.2011.01.031




  • March 9, 2011
  • 09:15 AM
  • 934 views

Tip of the Week: World Tour of Genomics Resources

by Jennifer in OpenHelix

Most weeks our tip is a five-minute movie that quickly introduces you to a new resource, or a cool new function at an established resource. Occasionally we feature one of our full resource tutorials that is being made freely available through resource sponsorship of our training suite. In this week’s tip we provide access to one of our tutorials that is especially near and dear to our heart. It is a World Tour of Genomics Resources in which we explore a variety of publicly-available biomedical, bioinformatics and bioscience databases and other resources.
This tutorial is quite different from our usual ones. Generally we focus on a specific software resource and describe step-by-step how to use its functions such as how to do basic and advanced searches, how to understand and modify displays, where to find specific types of data such as FASTA sequences, etc. and even provide tips on ‘hidden features’ that even power users find useful and informative.  This type of software training is absolutely critical.
But many people need an even earlier step: just the *awareness* that resources are available that might serve their needs. This tutorial fills that niche. We present a sampling of resources, all free to use, from each of 9 categories including: Analysis & Algorithms, Expression, Genome Browsers (for Eukaryotes and for Prokaryotes and Viruses), Genome Variation,  Literature, Nucleotides, Pathways and Proteins. After the World Tour, which is the majority of the tutorial, we then describe how to use OpenHelix’s free search and learn portal to find bioscience resources most appropriate for your research needs. From this the tour transitions into a brief discussion of the format of our training materials and how to use them, and then ends with information about other learning resources that we provide.
This tutorial has been wildly popular whenever we’ve done it as a live seminar. At the NIH they actually had to lock the doors because we’d hit the capacity of the room, and people were turned away. In fact, it has been so popular that we decided to produce it as a full tutorial suite and release it as one of our free trainings so that anyone and everyone could learn about the breadth of great public software options available for free use.
In addition to this free tutorial, we also have published a paper entitled “OpenHelix: bioinformatics education outside of a different box” in a special issue of Briefings in Bioinformatics entitled “Special Issue: Education in Bioinformatics”. This paper describes a plethora of sources where researchers can access informal educational sources of learning on publicly available bioinformatics resources. The sources of information include a wide variety of formats including lists of resources, journals that regularly feature tool descriptions, and eLearning sources such as the MIT OpenCourseWare effort. If you know of other such resources that aren’t covered in our tour or paper, comment & let us know about them – we love to learn as much as we love to teach!
Quick link to World Tour of Genomics Resources tutorial here.

Williams, J., Mangan, M., Perreault-Micale, C., Lathe, S., Sirohi, N., & Lathe, W. (2010). OpenHelix: bioinformatics education outside of a different box Briefings in Bioinformatics, 11 (6), 598-609 DOI: 10.1093/bib/bbq026






  • March 2, 2011
  • 09:17 AM
  • 1,218 views

Tip of the Week: DAnCER for disease-annotated epigenetics data

by Mary in OpenHelix


Epigenetics and epigenomics are becoming more exciting areas of investigation, and we are seeing more requests for database resources to support them, and for the sources of data from these types of experiments. If you aren’t aware of these investigations at this point, check out their entries in the Talking Glossary of Genetic Terms:
Epigenetics: Epigenetics is an emerging field of science that studies heritable changes caused by the activation and deactivation of genes without any change in the underlying DNA sequence of the organism. The word epigenetics is of Greek origin and literally means over and above (epi) the genome.
Epigenome: The term epigenome is derived from the Greek word epi which literally means “above” the genome. The epigenome consists of chemical compounds that modify, or mark, the genome in a way that tells it what to do, where to do it, and when to do it. Different cells have different epigenetic marks. These epigenetic marks, which are not part of the DNA itself, can be passed on from cell to cell as cells divide, and from one generation to the next.
And for the talking part–you can hear Dr. Laura Elnitski talk about these in more detail–have a listen at each entry. And just today an article providing an epigenetics primer appeared in my inbox: Epigenetics: A Primer.
These intriguing–and sometimes puzzling–chromatin modification (CM) signals and leads that are being unveiled in many labs and projects now are becoming more widely available in different databases. For this week’s tip of the week I’ll introduce DAnCER: Disease-Annotated Chromatin Epigenetics Resource, one of the tools that is organizing this type of data and enabling additional explorations. You can find DAnCER here: http://wodaklab.org/dancer/
In the associated publication, the DAnCER team describes other useful resources that provide epigenetics data. These include ChromDB, ChromatinDB (for yeast), and the Human Histone Modification Database (HHMD), among others. I’m also aware of other sources. A few months back I introduced the NCBI Epigenomics resource as my tip-of-the-week. (At that time I promised that when the publication became available I’d mention it–that’s now at the bottom in the references section below.) There’s also quite a bit of this data flowing in to the UCSC Genome Browser ENCODE DCC. Including–may I add–some data from the very cool Elnitski bi-directional promoter studies.  You can find similar data types via the modENCODE project as well.
So, there are lots of resources out there. Each provider has different projects, species, goals, displays, etc. But the group that developed DAnCER wanted to fill a niche they didn’t see available already: linking these epigenetic changes to possible disease association data. Here’s how they describe their work:
Our research effort therefore strives to explore CM-related genes in the context of their protein-interaction network, their partnership in multi-protein complexes and cellular pathways, as well as their gene expression profiles….
They are well-suited to linking this kind of information. You may remember our previous explorations and discussions of iRefWeb. The kind of network and interaction data that they assemble in that context can be brought to the chromatin-modification arena. The point is that you can take steps beyond the modifications you know about, to explore their neighborhood of interactions, and potentially unearth important disease relationships from that.
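To make the "neighborhood of interactions" idea a bit more concrete, here is a minimal Python sketch. It is not DAnCER's code or data: the gene names and edges are made-up placeholders, and `build_graph` and `neighborhood` are hypothetical helpers that simply walk an interaction network outward for a given number of steps.

```python
from collections import deque

# Toy interaction network: hypothetical placeholder names, not real DAnCER data.
INTERACTIONS = [
    ("CM_GENE_A", "PARTNER_1"),
    ("CM_GENE_A", "PARTNER_2"),
    ("PARTNER_2", "DISEASE_GENE_X"),
    ("PARTNER_3", "DISEASE_GENE_Y"),
]

def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def neighborhood(graph, start, max_steps):
    """Return {gene: distance} for genes within max_steps of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        gene = queue.popleft()
        if seen[gene] == max_steps:
            continue  # do not expand beyond the requested radius
        for partner in graph.get(gene, ()):
            if partner not in seen:
                seen[partner] = seen[gene] + 1
                queue.append(partner)
    del seen[start]  # report only the neighbors, not the starting gene
    return seen

graph = build_graph(INTERACTIONS)
print(neighborhood(graph, "CM_GENE_A", 2))
```

In this toy network the disease-linked placeholder gene only shows up once you look two steps out from the chromatin-modification gene, which is the kind of relationship the paragraph above has in mind.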
The data includes several species, so evolutionary conservation can also be explored.
So if you find that you are interested in exploring chromatin modifications, and want to take that data further, check out DAnCER, and the other tools and projects that are providing this type of information. If you have used the iRefWeb interface, you’ll see some similarities in structure. Search options with many filters are available. Color-coded and sortable results are provided. Links to gene details within the Wodak lab tools and external links are offered. On the gene pages at DAnCER you’ll have many types of annotations, including Gene Ontology descriptions, evidence type and references, neighbors, and protein domain information as well. And besides the texty-table based stuff, you can choose to load up the interactive network/interaction graphic, just like with the iRefWeb tool.
There’s a lot of opportunity to learn things from this tool. Try it out.
Quick Links and References:
DAnCER http://wodaklab.org/dancer/
Turinsky, A., Turner, B., Borja, R., Gleeson, J., Heath, M., Pu, S., Switzer, T., Dong, D., Gong, Y., On, T., Xiong, X., Emili, A., Greenblatt, J., Parkinson, J., Zhang, Z., & Wodak, S. (2010). DAnCER: Disease-Annotated Chromatin Epigenetics Resource Nucleic Acids Research, 39 (Database) DOI: 10.1093/nar/gkq857
Fingerman, I., McDaniel, L., Zhang, X., Ratzat, W., Hassan, T., Jiang, Z., Cohen, R., & Schuler, G. (2010). NCBI Epigenomics: a new public resource for exploring epigenomic data sets Nucleic Acids Research, 39 (Database) DOI: 10.1093/nar/gkq1146




  • February 16, 2011
  • 09:11 AM
  • 1,677 views

Tip of the Week: Melina II for promoter analysis

by Mary in OpenHelix


One of the most frequently-asked questions we get when we are out doing workshops is: how do I find motifs in promoters, and what can I do with them to find more information? Just last Friday we were asked this again at the workshops we did at USC. So for this week’s tip of the week I’m going to show one of the tools I recommend for that purpose–Melina II. (I also recommended the MEME Suite and VISTA’s rVISTA features, but for this tip I’ll focus on Melina.)
Melina II is not a new tool; it’s been around for a while. But it’s been one of my favorites because of the way it combines several tools that I would otherwise have to access separately. And I like the graphical representation that it delivers for the motifs that are discovered.
As they say on their homepage, it’s a straightforward 3-step process: put in some sequences, choose motif finders and set parameters, and then run to see your results. And it is just that easy. You can go in and tweak all of the motif finder parameters if you like, but a default setting search will quickly get you started finding motifs from an input set of sequences.
You’ll have a graphical display of the motif location in the sequence panel at the top, but you can click on any of the colored discovered motifs to display the alignments, the sequence logo, or the weight matrix at the bottom. And from there you could also do a couple of searches of other resources as well to locate additional promoters that may carry your motif.
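If the relationship between the alignment and the weight matrix is new to you, a small Python sketch may help. This is not how Melina II or any of its motif finders work internally; it only shows, under simple assumptions, how a set of already-aligned motif instances can be turned into a per-position count matrix and a consensus. The sequences are made up, and `frequency_matrix` and `consensus` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical aligned motif instances (equal length), made up for illustration.
SITES = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]

def frequency_matrix(sites):
    """Count how often each base occurs at each aligned position."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):  # one column per motif position
        counts = Counter(column)
        matrix.append({base: counts.get(base, 0) for base in "ACGT"})
    return matrix

def consensus(matrix):
    """Pick the most frequent base per position (ties go to the first of A, C, G, T)."""
    return "".join(max("ACGT", key=lambda base: col[base]) for col in matrix)

matrix = frequency_matrix(SITES)
for position, col in enumerate(matrix, start=1):
    print(position, col)
print(consensus(matrix))
```

A sequence logo is essentially a picture of the same table: positions where one base dominates are drawn tall, and mixed positions are drawn short.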
It’s such a quick and slick way to look for motifs it has long been one of my first choices for this kind of analysis. You can access each of the individual motif finder tools at their home sites as well, and there may be more features over there. But to get started this is a very nice choice.
Note: A couple of weeks ago the Melina II tool was down, because of server issues. We talked with the team and although it’s up and running on a backup server, some of the settings aren’t quite right yet. But I still wanted to discuss it because of answering that question from the workshop. So you can try it out, but check back at a later date for the full server with the correct settings.
Find Melina II here: http://melina2.hgc.jp/
Reference:
Okumura, T., Makiguchi, H., Makita, Y., Yamashita, R., & Nakai, K. (2007). Melina II: a web tool for comparisons among several predictive algorithms to find potential motifs from promoter regions Nucleic Acids Research, 35 (Web Server) DOI: 10.1093/nar/gkm362
++++++++++++++++++++++++++++++++
(We have just updated our full tutorial version of the Melina II tool, which is available for individual purchase or by subscription here.)




  • February 2, 2011
  • 09:05 AM
  • 844 views

Tip of the Week: RCSB PDB Data Distribution Summaries

by Jennifer in OpenHelix


In today’s tip I will feature the data distribution summaries and their drill down features which you can see from many RCSB PDB searches. We are in the process of updating our full tutorial sponsored by the RCSB PDB team, and as part of that effort I’ve gotten to know and appreciate this new data presentation format. Over the last five years the RCSB PDB has really been working hard at redesigning their resource to be more easily accessed by a wide variety of users. Below you will find a recent citation from the group explaining all of their updates and the logic behind them. The paper is a good read, both because I’ll only have time to scratch the surface of the redesign here (you’ll get the details there) and because the intro gives a great glimpse into what resources are dealing with in the way of ‘data deluge’. The increase in users AND data that the RCSB PDB has experienced over the last few years is mind boggling!
OK, back to the data distributions. To me these are really elegant ways of helping any user – PDB is by no means just for structural biologists – come to the RCSB PDB & quickly and easily access whole categories of interesting information and then drill down in detailed ways to access the specific structure or data that they are most interested in. For example, I could begin with a keyword search for something as general as ‘kinase’. This search retrieves over 4 thousand hits, which could be quite daunting, but at the top of the report results are displayed under categories such as Organism, Taxonomy, Experimental Method, SCOP classification and more. Subcategories under each of these categories let me know how many hits are, for example, of a mixed polymer type, are human hits, or are alpha and beta proteins. I can mouse over any subcategory title to find out the percent of hits in this category compared to all hits, or click on the title to drill down further into the data distribution for just that subcategory of results. The distribution summaries are then updated to focus specifically on the distribution of THOSE data. Using these summaries is much more intuitive than any text description that I can muster.
My advice? Check out the tip, then check out the data distribution summaries, drill down utility, and all the other great features of the RCSB PDB & see how easy it is to find information on your favorite gene. Oh yeah, and be watching for us to release our full, free & newly updated tutorial on the RCSB PDB resource soon!
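For anyone who thinks better in code than in prose, the idea behind a distribution summary and a drill-down can be sketched in a few lines of Python. This does not use the RCSB PDB web services; the hit records, field names, and the helpers `distribution` and `drill_down` are all hypothetical and only illustrate the counting and filtering logic.

```python
from collections import Counter

# Hypothetical search hits; the field names and values are illustrative only.
HITS = [
    {"id": "hit1", "organism": "Homo sapiens", "method": "X-ray"},
    {"id": "hit2", "organism": "Homo sapiens", "method": "NMR"},
    {"id": "hit3", "organism": "Mus musculus", "method": "X-ray"},
    {"id": "hit4", "organism": "Homo sapiens", "method": "X-ray"},
]

def distribution(hits, category):
    """Return {subcategory: (count, percent of all hits)} for one category."""
    counts = Counter(hit[category] for hit in hits)
    total = len(hits)
    return {name: (n, 100.0 * n / total) for name, n in counts.items()}

def drill_down(hits, category, value):
    """Keep only the hits that fall in one subcategory."""
    return [hit for hit in hits if hit[category] == value]

# Summary over all hits, then the same summary recomputed on the human subset.
print(distribution(HITS, "organism"))
human_hits = drill_down(HITS, "organism", "Homo sapiens")
print(distribution(human_hits, "method"))
```

The second call is the "drill down" step: the percentages are recalculated relative to the narrowed set, just as the summaries on the site refocus on the subcategory you clicked.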
Rose, P., Beran, B., Bi, C., Bluhm, W., Dimitropoulos, D., Goodsell, D., Prlic, A., Quesada, M., Quinn, G., Westbrook, J., Young, J., Yukich, B., Zardecki, C., Berman, H., & Bourne, P. (2010). The RCSB Protein Data Bank: redesigned web site and web services Nucleic Acids Research, 39 (Database) DOI: 10.1093/nar/gkq1021



  • February 1, 2011
  • 02:20 PM
  • 956 views

modENCODE: the data bonanza ensues

by Mary in OpenHelix

Another of the “big data” projects that is underway is the ENCODE project, or Encyclopedia of DNA Elements, to provide comprehensive annotation of genomic elements.  Some people are aware of this and are using the data already. If you aren’t, you should check out the online tutorial, freely available because it is sponsored by the UCSC ENCODE Data Coordination Center (DCC) team, for an overview of the organization and availability of the ENCODE mammal data that you can find in the UCSC Genome Browser. That data is flowing in, and you can start looking at it now.
There’s another branch of ENCODE, though, which is not housed at UCSC, that you should be aware of: modENCODE. The modENCODE project–as you might guess from the name–is aimed at model organisms. The principles are similar: to explore and analyze all the functional elements that make up the genome. But the focus is on model organism species: Drosophila and C. elegans. The data coordination center for modENCODE is handled separately from the mammalian branch, but the groups coordinate and interact in other project arenas.
There’s a marker paper from 2009 that establishes the foundation and the framework for the modENCODE project. But just before Christmas there were 2 papers that came out that provide terrific overviews of the status of the modENCODE projects. There’s one for each organism.
One of the parts that really struck me about the modENCODE features is that they have the opportunity to explore developmental life stages that aren’t possible with the human ENCODE data. As someone who studied developmental biology in the lab, that’s a particularly keen aspect of this for me. So much of what we know about human is adult or cell line data, and there’s so much to learn when you can explore over time in this way. Very neat.
Both papers provide the fairly standard sort of “big data” paper framework: why we did this, what we did, summary statistics for things they analyzed, and some compelling examples of a few sample tidbits. But like all of the big data papers, the real data you might need really isn’t in there. There’s going to be a lot more in the supplement. But mostly you’ll have to go to the DCC databases to browse around and query for items and regions of interest for your work. You should go over to the modENCODE site and start your mining with the modMINE tools.
I just noticed in my twitter feed today though that there’s more you should know about if this project is relevant for your work: there is a special issue of Genome Research that collects the more detailed data papers from the modENCODE projects. (hat tip to @bachinsky for that. PS: this is why I use twitter for work).
I haven’t had time to read the Genome Research papers yet, but I can see they cover the data, methods, and the reagents/resources that are associated with the project. There’s going to be a wealth of stuff over there. Check it all out.
References for modENCODE:
Marker paper 2009:
Celniker, S., Dillon, L., Gerstein, M., Gunsalus, K., Henikoff, S., Karpen, G., Kellis, M., Lai, E., Lieb, J., MacAlpine, D., Micklem, G., Piano, F., Snyder, M., Stein, L., White, K., & Waterston, R. (2009). Unlocking the secrets of the genome Nature, 459 (7249), 927-930 DOI: 10.1038/459927a
New papers 2010:
Gerstein, M., Lu, Z., Van Nostrand, E., Cheng, C., Arshinoff, B., Liu, T., Yip, K., Robilotto, R., Rechtsteiner, A., Ikegami, K.... (2010) Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project. Science, 330(6012), 1775-1787. DOI: 10.1126/science.1196914
The modENCODE Consortium, Roy, S., Ernst, J., Kharchenko, P., Kheradpour, P., Negre, N., Eaton, M., Landolin, J., Bristow, C., Ma, L.... (2010) Identification of Functional Elements and Regulatory Circuits by Drosophila modENCODE. Science, 330(6012), 1787-1797. DOI: 10.1126/science.1198374

  • January 26, 2011
  • 09:15 AM
  • 1,139 views

Tip of the Week: iRefWeb + protein interaction curation

by Mary in OpenHelix

For this week’s tip of the week I’m going to introduce iRefWeb, a resource that provides thousands of data points on protein-protein interactions.  If you follow this blog regularly, you may remember that we had a guest post from the iRefWeb team not too long ago. It was a nice overview of many of the important aspects of this tool, and I won’t go into those again here–you should check that out. Andrei knows those details quite well!
And at the time we also mentioned their webinar was coming up. We were unable to attend that, though, because we were doing workshops at The Stowers Institute. I was delighted to find that their webcast is now available to watch in full. It’s about 40 minutes long and covers much more than my 5-minute appetizer could do.  It details many practical aspects of how to use iRefWeb effectively.
Because they’ve done all the prep work for me, I don’t need to spend much time on the structural and functional features here. What I would like to do is draw your attention to a different aspect of their work. Their project draws together protein interaction data from a variety of source databases–including some of our favorites such as MINT and IntAct (for which we have training suites available for purchase).  They then used the iRefWeb processes and projects to evaluate and consider the issues around curation of protein-protein interaction data, and recently published those results. That’s what I’ll be focusing on in the post.
Every so often a database flame-war erupts in the bioinformatics community. Generally it involves someone writing a review of databases and/or their content. These evaluations are sometimes critical, sometimes not–but often what happens is that the database providers feel that their site is either misrepresented, or unfairly chastised, or at a minimum incompletely detailed on their mission and methods. I remember one flambé that developed not too long ago around a paper by our old friend from our Proteome days–Mike Cusick–and his colleagues (and we talked about that here). As the OpenHelix team has been involved in plenty of software and curation teams, we know how these play out. And we have sympathy for both the authors and the database providers in these situations.
So when the iRefWeb site pointed me to their new paper I thought: oh-oh…shall I wear my asbestos pantsuit for this one???  The title is Literature curation of protein interactions: measuring agreement across major public databases.  Heh–how’s that working out for ya?
Anyway–it turns out not to need protective gear, in my opinion. Because their project brings data from several interaction database sources, they are well-positioned to collect information about the data to compare the data sets. They clearly explain their stringent criteria, and then look at the data from different papers as it is collected across different databases.
A key point is this:
On average, two databases curating the same publication agree on 42% of their interactions. The discrepancies between the sets of proteins annotated from the same publication are typically less pronounced, with the average agreement of 62%, but the overall trend is similar.
So although there is overlap, different databases have different data stored. This won’t be a surprise to most of us in bioinformatics. But I think it is something that end users need to understand. The iRefWeb team acknowledges that there are many sources of difference among data curation teams. Some curate only certain species. Some include all data from high-throughput studies, others take only high-confidence subsets of that data. And it’s fine for different teams to slice the data how they want. Users just need to be aware of this.
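To get a feel for what "agreement" between two databases on one paper could mean, here is a small Python sketch. It uses the Jaccard index (shared items divided by all items) as one common overlap score; that choice is my assumption for illustration, and the paper's exact scoring may differ. The protein names and the two curated lists are hypothetical placeholders.

```python
def as_pairs(interactions):
    """Treat each interaction as an unordered pair so A-B equals B-A."""
    return {frozenset(pair) for pair in interactions}

def jaccard(set_a, set_b):
    """Shared items divided by all items; two empty sets count as full agreement."""
    union = set_a | set_b
    if not union:
        return 1.0
    return len(set_a & set_b) / len(union)

def proteins(interactions):
    """Collect every protein mentioned in a list of interactions."""
    return {protein for pair in interactions for protein in pair}

# Hypothetical curation of the same paper by two databases (placeholder names).
db1 = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
db2 = [("P2", "P1"), ("P3", "P4"), ("P4", "P5")]

interaction_agreement = jaccard(as_pairs(db1), as_pairs(db2))
protein_agreement = jaccard(proteins(db1), proteins(db2))
print(interaction_agreement, protein_agreement)
```

Even in this toy case the two lists agree more on which proteins are involved than on which pairs interact, which is the same direction as the trend quoted above.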
It seems that in general there’s more agreement between curators on non-vertebrate model organism data sets than there is for vertebrates. Isoform complexity is a major problem among the hairy organisms, it turns out–and this affects how the iRefWeb team scored the data sets. And as always when curation is evaluated–the authors of papers are sometimes found to be at fault for providing some vagueness to their data sets.
The iRefWeb tools offer you a way to assess what’s available from a given paper in a straightforward manner. In their webinar, you can hear them describe that ~30 minutes in. If you use protein-protein interaction data, you should check that out.
Caveat emptor for protein-protein interaction data (well, and all data in databases, really). But iRefWeb provides an indication of what is available and what the sources are–all of it traceable to the original papers.
The paper is a nice overview of the issues, not a specific criticism of any of the sources. They note the importance of the curation standards encouraged by the Proteomics Standards Initiative–Molecular Interaction (PSI-MI) ontologies and efforts. And they use their paper to raise awareness of where there may be dragons. It seems that dragons are quite an issue for multi-protein complex data.
Your mileage may vary. If you are a data provider, you may want to have protective gear for this paper. But as someone not connected directly to any of the projects, I thought it was reasonable. And something to keep in mind as a user of data–especially as more “big data” proteomics projects start rolling out more and more data.
Quick links and References:
iRefWeb http://wodaklab.org/iRefWeb/
Their Webinar: http://www.g-sin.com/home/events/Learn_about_iRefWeb
Turinsky, A., Razick, S., Turner, B., Donaldson, I., & Wodak, S. (2010). Literature curation of protein interactions: measuring agreement across major public databases Database, 2010 DOI: 10.1093/database/baq026
Cusick, M., Yu, H., Smolyar, A., Venkatesan, K., Carvunis, A., Simonis, N., Rual, J., Borick, H., Braun, P., Dreze, M., Vandenhaute, J., Galli, M., Yazaki, J., Hill, D., Ecker, J., Roth, F., & Vidal, M. (2009). Literature-curated protein interaction datasets Nature Methods, 6 (1), 39-46 DOI: 10.1038/nmeth.1284

  • January 12, 2011
  • 09:12 AM
  • 893 views

Tip of the Week: Twitter in Bioinformatics

by Mary in OpenHelix


So let’s talk about Twitter. Some voted against it in the poll–but the folks who were interested in seeing how I use Twitter carried the day 80% to 20%. If you were one of the ones who voted against this–join us next week instead. But if you are wondering how this tool might be of use in this arena, you can watch as I introduce my strategy for gaining useful bioinformatics tidbits from this source.
I suppose if you are very new to this you need some vocabulary. Twitter is the provider of the platform that enables people to send short (140 character) text messages around. The individual messages are referred to as tweets. And although it may not seem like 140 characters is of much utility, there are some really useful things that can be conveyed in that much text.  Here’s a sample tweet from the NCBI. They would be referred to as @NCBI in the twittersphere.
You don’t have to have a twitter account to see tweets from NCBI. You could just go to the NCBI twitter page and look at them. Or you can do what I do–I use another kind of software tool to organize and monitor the tweets I’m interested in.
I use TweetDeck, a free interface for twitter management. I have no relationship with them. It’s just the one I started using. There may be other tools for this other people will recommend–feel free to tell us your favorite. I have it on my desktop, and on my iPhone.
Here’s a sample of my TweetDeck interface. It sits on my desktop, and sort of trills when a new message comes in. You can change the settings for the notifications if that drives you nuts, though. And if you share an office–turn the sound off.
I have it set up to do a standing search for keywords, and to show me things that people or projects that I am “following” have sent out. I have a column for the keywords genome/genomics/etc, one for bioinformatics, and then the OpenHelix follow set on the right. (I also have my friends and personal tweets off to the left; you’d have to scroll over to see those columns). So items flow in all day that meet my search criteria, or are from professional sources I’m interested in following, and not shown are notes from my friends–some of whom are e-friends and some I know IRL (in real life).
You could stop at this point and just watch things come to you if that’s all you need. Some people, though, may want to share things–items they’ve read and liked, questions for others, announcements they have, etc. For that part you need a Twitter account to send messages.
The message sending part is at the top of the TweetDeck. It tells you how many characters you have left, and turns red if you are over the limit of 140. A couple of things about messages:

Consider sending fewer than 140 characters sometimes, if you want people to re-tweet your message. The re-tweet would carry your twitter name and eat up more characters. So retweets of something we send would begin: RT @openhelix…. costing an additional 13 characters for the next person. Sometimes in a re-tweet people will have to edit, and those might be designated as MT or Modified Tweet.
Use a URL shortening service to make small links rather than long ones that use up your characters. I use bit.ly, but others are available. You can set up your TweetDeck to automatically convert them too.
Use a hashtag for key terms–which you can add in addition to your text. For example, you may see examples like #genomics or #bioinformatics in there. #GWAS is a common one. In the movie I show the hashtag for the Pacific Symposium on Biocomputing 2011 conference, #PSB2011. People at the conference can tweet stuff with that tag, and if you have a column set up for that it will feed to that. Or you can use it as a search term. Hashtags have also become a means of joking around on twitter, such as #PeopleAreInsane or #itswetoutthere; #Aflockalypse was big around the Arkansas bird death news.
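The character arithmetic in the tips above is easy to check for yourself. Here is a minimal Python sketch, assuming the 140-character limit described in this post and the simple "RT @name message" retweet style; the sample message and the helper names are made up for illustration.

```python
LIMIT = 140  # the character limit described in this post

def retweet_text(username, message):
    """Build an old-style retweet: 'RT @username ' followed by the message."""
    return f"RT @{username} {message}"

def characters_left(text, limit=LIMIT):
    """Characters remaining; a negative number means the text is over the limit."""
    return limit - len(text)

def retweetable_budget(username, limit=LIMIT):
    """Longest message that can still be retweeted unedited by followers."""
    return limit - len(retweet_text(username, ""))

original = "New tutorial on the blog today #bioinformatics"  # hypothetical tweet
retweet = retweet_text("openhelix", original)

print(characters_left(original))
print(characters_left(retweet))
print(retweetable_budget("openhelix"))
```

Note that "RT @openhelix" is 13 characters on its own; this sketch also counts the space that separates it from the message, so the retweet ends up 14 characters longer than the original.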

{One quick additional item about links: we have a service that stores all the links we tweet in one place. It’s a handy place to look for things that you know were sent, but can’t remember exactly. Our list is here: http://trunk.ly/openhelix/ }
You may not be convinced that Twitter is of any use for you from this. And that’s fine. But people are using it for networking, outreach, sharing science, asking questions, and many people find it has utility for them. I love to get updates from NCBI, Cytoscape, and other database and software providers about new features and updates. People share cool papers and new useful sites. There really are some gems out there. And we will forward interesting items, or send out information when we have cool blog posts or new tutorials or other sorts of stuff. So we use it for outreach as well.
Twitter use may even contribute to world knowledge. I found a recent paper in PLoS that analyzed tweets associated with the H1N1 flu epidemic. They found that a lot of good information was being generated. They were also able to find some bad information, with misinformation peaking when some of the cranks at Natural News published an article of some fiction. So it certainly is a mixed bag. (It was the first scientific paper I ever read that referred to Harry Potter, Twitter, and the flu.) PubMed currently has 66 articles that contain the search term “twitter” (one seems to be about actual birds….). I’m sure there will be more.
In a recent interview, Ed Yong the science writer describes his perspective on the use of twitter for his work. As I was writing this up, Ed provided a funny tidbit for a Friday afternoon:

edyong209 Codswallop. You, sir, are a cad and blackguard, wot wot. RT @b0yle : Are you tweeting with an accent? http://on.msnbc.com/gU3Tc6

If you start watching it, you’ll catch on to the etiquette and the common abbreviations–apparently some of the local variants, too. You can look for helpful stuff all around the web.
Twitter is not for everyone. But like most technology–if used properly, you can get something valuable out of it. And the #bioinformatics community is pretty good about using it well, I would say.
Follow us if you want to: http://twitter.com/openhelix Oh–and if you manage the twitter feed for a project or database or type of tool that we’d be interested in, drop us a comment and we’ll start following you. It would be neat to link up more of the projects with this mechanism for outreach.
Reference:
Chew, C., & Eysenbach, G. (2010). Pandemics in the Age of Twitter: Content Analysis of...

  • January 5, 2011
  • 08:45 AM
  • 1,529 views

Tip of the Week: SKIPPY predicting variants w/ splicing effects

by Jennifer in OpenHelix


More and more disease-causing mutations are being identified in exonic splicing regulatory sequences (ESRs). These disease effects  can result from ESR mutations that cause exon skipping in functionally diverse genes. In today’s tip I’d like to introduce you to a tool designed to detect exon variants that modulate splicing. The tool is named SKIPPY and has been developed and is maintained by groups in the Genomic Functional Analysis research section of the NHGRI.
At the end of the post I cite a very well-written paper describing the development of SKIPPY, as well as the background on why the tool was developed. I won’t have time to go into all those details, but if you are interested the paper is freely available from Genome Biology. The site also has nice, clear documentation and example inputs – which I will use as my examples. Splicing can be modulated in a variety of ways, including the loss or gain of exonic splicing enhancers (ESEs) or silencers (ESSs). Variants accomplishing either of those are referred to as splice-affecting genome variants, or SAVs. Not all of the abbreviations are explained on the results page, as you will see in the tip, but all are explained in detail in the SKIPPY publication, and in the ‘Methods and Interpretations’ and ‘Quick Reference and Tutorial’ areas of the site.
I first found the tool because it was mentioned in a nice review entitled “Using Bioinformatics to predict the functional impact of SNVs“, which is a paper that reviews mechanisms by which point mutations can affect function, describes many of the algorithms and resources available & provides some sage advice. I’ll post more on it in a later post. For now, check out the tip & the SKIPPY resource, and if you use the site please let us know what you think.
Woolfe, A., Mullikin, J., & Elnitski, L. (2010). Genomic features defining exonic variants that modulate splicing Genome Biology, 11 (2) DOI: 10.1186/gb-2010-11-2-r20
Cline, M., & Karchin, R. (2010). Using bioinformatics to predict the functional impact of SNVs Bioinformatics DOI: 10.1093/bioinformatics/btq695



  • January 4, 2011
  • 10:51 AM
  • 1,205 views

NAR Database issue…get it while it’s hot!

by Mary in OpenHelix

Ok, it’s hot now–but it’s something we refer back to all year long, actually. For people who don’t know about the NAR Database Issue, since the mid-90s Nucleic Acids Research has been collecting bioinformatics databases and tools that are of use to a huge range of researchers. We’ve watched it grow over the years and we’ve even graphed it. We’ll have to update that graph with the new data point for this year.  But here’s the graph as we published it last year:
(You can get this figure from our paper here, it is Figure 1)
You can see steady growth in the resources collected in the NAR set. But that’s certainly not all of them–others can be found in their server issue in the summer, and some just aren’t listed in a lot of places. We think there are in the range of 3000 tools and resources of some sort around.
A nice overview of the state of play is always provided in the introduction paper for that issue. As they state, this year we are up to 1330 data sources in their list. And they also highlight a couple of editorials that address important issues in this arena. One is about the need for data sources to talk to each other. This is an important point:
these databases risk functioning increasingly as isolated islands in a sea of disparate biological data
And there’s another editorial that speaks to the understanding of the data we have in our hands–and the need to understand it better. It describes COMBREX–a very cool effort:
This project is designed to serve as a clearinghouse, collecting functional predictions from specialists in bioinformatics and functional genomics and then sending these predictions for testing by experimentalists.
This is the kind of thing that makes me wish I still had a lab. There’s so much opportunity here…alas. The road not taken. But a hot opportunity for smart youngsters who might like to carve out a niche with a lab that mines the computational materials and pairs it with great projects for students to do the bench characterizations. And it offers grants to do this work….
Anyway–check out the NAR database issue. It’s worth your time. Really.
References:
Williams, J., Mangan, M., Perreault-Micale, C., Lathe, S., Sirohi, N., & Lathe, W. (2010). OpenHelix: bioinformatics education outside of a different box Briefings in Bioinformatics, 11 (6), 598-609 DOI: 10.1093/bib/bbq026
Galperin, M., & Cochrane, G. (2010). The 2011 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection Nucleic Acids Research, 39 (Database) DOI: 10.1093/nar/gkq1243
Gaudet, P., Bairoch, A., Field, D., Sansone, S., Taylor, C., Attwood, T., Bateman, A., Blake, J., Bult, C., Cherry, J., Chisholm, R., Cochrane, G., Cook, C., Eppig, J., Galperin, M., Gentleman, R., Goble, C., Gojobori, T., Hancock, J., Howe, D., Imanishi, T., Kelso, J., Landsman, D., Lewis, S., Mizrachi, I., Orchard, S., Ouellette, B., Ranganathan, S., Richardson, L., Rocca-Serra, P., Schofield, P., Smedley, D., Southan, C., Tan, T., Tatusova, T., Whetzel, P., White, O., & Yamasaki, C. (2010). Towards BioDBcore: a community-defined information specification for biological databases Nucleic Acids Research, 39 (Database) DOI: 10.1093/nar/gkq1173
Roberts, R., Chang, Y., Hu, Z., Rachlin, J., Anton, B., Pokrzywa, R., Choi, H., Faller, L., Guleria, J., Housman, G., Klitgord, N., Mazumdar, V., McGettrick, M., Osmani, L., Swaminathan, R., Tao, K., Letovsky, S., Vitkup, D., Segre, D., Salzberg, S., Delisi, C., Steffen, M., & Kasif, S. (2010). COMBREX: a project to accelerate the functional annotation of prokaryotic genomes Nucleic Acids Research, 39 (Database) DOI: 10.1093/nar/gkq1168



  • December 15, 2010
  • 01:34 AM
  • 1,630 views

Tip of the Week: RepTar, a database of miRNA target sites

by Trey in OpenHelix

microRNAs have become a rich source of research as they probably have a huge effect on gene expression and disease. The human genome may encode over 1,000 miRNAs that target over half of our genes. They might be implicated in a lot of common diseases (which have not yet been picked up in GWAS?). They are a fascinating area of biology that has only come into its own in the last decade. As such, the number of databases to catalog miRNAs is large. Today’s tip is on a new one, RepTar, which is reported in the upcoming NAR database issue. The niche RepTar is attempting to fill is to make miRNA target predictions more comprehensive by including new research in the algorithm. This new research suggests there are more possible target sites than previously thought. As mentioned in the article,

Recently, the miRNA binding options were expanded further with the identification of ‘centered sites’, functional miRNA target sites that lack both perfect seed pairing and 3′-compensatory pairing and instead exhibit pairing with the target along 11–12 contiguous pairs at the center of the miRNA (4). While some algorithms relaxed the evolutionary conservation criterion (5–11) and/or offer also predictions of 3′-compensatory sites [e.g. (6,12,13)], few databases offer predictions of the whole repertoire of miRNA targeting patterns. Furthermore to date, no database lists genome-wide prediction of cellular targets of viral miRNAs. These miRNAs lack significant evolutionary conservation and their targets are not necessarily expected to be evolutionarily conserved. In addition, the few identified viral miRNA targets have shown both conventional seed binding and 3′-compensatory binding [e.g. (3,14)].
Here we present a database of genome-wide miRNA target predictions for mouse and human genes, based on the predictions of our novel target prediction algorithm, RepTar
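
To make the ‘centered site’ idea in that quote a little more concrete, here is a toy Python sketch that measures the longest contiguous stretch of Watson-Crick pairing between a miRNA and a candidate site of the same length. It is not the RepTar algorithm, which the paper describes and this post does not. The function names, the equal-length antiparallel alignment and the choice to ignore G-U wobble pairs are all assumptions made for illustration.

```python
# Toy illustration only: NOT the RepTar algorithm.
# Assumes both sequences are RNA, written 5'->3', and have equal length,
# so miRNA position i faces site position (len - 1 - i) in an antiparallel duplex.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the perfectly complementary site for an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def longest_paired_run(mirna, site):
    """Return (start, length) of the longest contiguous Watson-Crick run.

    `start` is the 1-based miRNA position where the run begins.
    G-U wobble pairs are deliberately not counted (an assumption).
    """
    if len(mirna) != len(site):
        raise ValueError("this toy example needs equal-length sequences")
    best_start, best_len, run = 0, 0, 0
    for i, base in enumerate(mirna):
        partner = site[len(site) - 1 - i]
        if (base, partner) in WATSON_CRICK:
            run += 1
            if run > best_len:
                best_len = run
                best_start = i - run + 1
        else:
            run = 0
    return best_start + 1, best_len
```

A run of 11 to 12 pairs that starts a few bases in from the 5′ end, with the seed itself unpaired, is the kind of pattern the quoted passage calls a centered site. A real predictor would also have to handle bulges, wobble pairs and scoring, none of which this sketch attempts.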

I’ll leave the predictive value up to miRNA researchers, but I thought I’d introduce the site.
While I’m at it, allow me to list a few other miRNA sites from labs and institutes as far flung as China, Italy, Israel, Canada and the U.S. Perhaps someday I’ll do a comparison.
CircuitsDB, which Jennifer did a great tip of the week tutorial on.
miRBase, which we have a full-length tutorial on.
microRNA.org
HMDD
miRDB
TarBase
miRecords
PicTar, they have an annotation track for UCSC Genome Browser
miRNA2Disease
PuTmiR (in relation to transcription factors)
microRNAdb
two lists to catch some others: http://mirnablog.com/microrna-target-prediction-tools/ and  http://www.ncrna.org/KnowledgeBase/link-database/mirna_target_database
Elefant, N., Berger, A., Shein, H., Hofree, M., Margalit, H., & Altuvia, Y. (2010). RepTar: a database of predicted cellular targets of host and viral miRNAs Nucleic Acids Research DOI: 10.1093/nar/gkq1233



  • December 13, 2010
  • 10:05 AM
  • 632 views

DNA Deniers

by Mary in OpenHelix

Wha?
From Michael Pollan:
"How the gene-disease paradigm appears to be collapsing. Why aren't we hearing about this?! http://p2.to/14XB"

Wha?
Michael Pollan and his flock became all aerated the other day when Michael tweeted this tidbit. It links to a story with quite the title:
The Great DNA Data Deficit: Are Genes for Disease a Mirage?
Srsly. That’s what it says.
What do I think this is? The second case of gene denialism that I have observed. (The first was a group disputing autism genes.)
I knew that after the genome came along there would be woo. I knew that snake-oil salesmen would be pitching purchases that would work with your skin genes. I knew there would be anti-aging compounds that work with your genes. I know there’s already a DNA diet, and vitamins sold to you based on your DNA. I’ve seen DNA dating. But honestly, I didn’t expect the DNA deniers.
Probably I should have seen it coming. I’ve followed a couple of different topics that flow with anti-science woo: anti-vaxxers and anti-GMOsters. There is overlap between these groups, but it’s not complete. But there is remarkable coincidence between their argument styles. Both groups make big claims, mostly unsourced–or if sourced are cherry-picked points or entirely misused. And when the science isn’t going their way, they deny the science, and then they move the goal posts.
This article–which purportedly blows away the connection between genes and disease–is appallingly mistaken. Let me be clear: genes can influence disease risk. Period. Of course environment may influence biology. Diet and exercise can affect health, certainly. Exposure to natural or man-made carcinogens can trigger cancer. And even the hardest core gene jocks know this.  But this desire to sever the connection between genes and diabetes–or prostate cancer, or Crohn’s disease–because they haven’t found a single smoking gun gene yet, using one kind of study?  That’s just bizarre and twisted. There are numerous examples of leads on complex disorders that are quite strong, insights into disease pathways and mechanisms, and we’ve really just started. And new technologies are opening new paths as well. A nice article on this was in Nature this fall: Genomics: The search for association.

Sure, we’ve wanted more data and stronger signals from GWAS (genome-wide association studies). But it turns out humans are inveterate outbreeders and it’s hard to tease out strong pointers from them.
Probably if you are a regular reader of this blog I don’t need to convince you of that. But for anyone else who stumbles across this let me offer some resources:

Genetics Home Reference


OMIM


Genetic Alliance


NHGRI GWAS Catalog

What I can’t quite figure out is why the authors of that post attempt to discredit all the work and all of the discoveries that have been made so far–and those we are going to unearth. GWAS is a relatively new strategy, and as we refine the tools and the population study groups, and build on new knowledge, we are going to find more. And much has been done–check out the GWAS Catalog for an overview. Scroll down. And keep scrolling. As it says on that page: “As of 12/09/10, this table includes 725 publications and 3606 SNPs.”
And I also can’t figure out why Pollan’s minions are celebrating this. Here are samples of the responses to this:



If this were true (which it certainly isn’t), why would this be a reason to chuckle? Why celebrate? I honestly don’t get it. The actual emotion ought to be embarrassment for the credulity.
But ok, you aren’t down with the human GWAS data right now–let’s look at other GWAS and see what’s coming out of that. There have been some really stunning examples of these studies in dogs. There was a talk a couple of years back that we watched: Genes for Complex Traits in the Domestic Dog. You can watch that online to learn more. An advantage of working with dogs is that they are highly inbred. (A professor of mine in grad school once snarked that we can’t do that with humans, although that was part of the purpose of the Ivy League, he claimed.) A couple of hundred years of intensive breeding and good pedigree records make gene hunting in the canine genome somewhat easier than it is in the messy human populations. Here’s another recent article on dog traits in Nature. If you think that genes don’t cause complex disorders, you have to dispute that some dog breeds are prone to anxiety due to their genes. And that some are prone to deafness–it’s clearly the Dalmatian lifestyle, right? Or that Dobermans are bringing their narcolepsy on themselves somehow. [Seriously, they are narcoleptic? Who knew....]
Clearly the authors have an agenda. At the end they make their case:
Nevertheless, most governments cooperate far more, for example, with their food industries than with those who wish to eat a healthy diet. The laying to rest of genetic determinism for disease, however, provides an opportunity to shift this cynical political calculus. It raises the stakes by confronting policy-makers as never before with the fact that they have every opportunity, through promoting food labeling, taxing junk food, or funding unbiased research, to help their electorates make enormously positive lifestyle choices.
Author Latham goes further at HuffPo (home to mucho woo of many stripes) emphasis mine:

That means environment must be the entire cause of ill health, i.e. junk food, pollution, lack of exercise, etc. The reason we wrote an article about human genetics (when we are a food and agriculture website) is that we believe that if people live right, agriculture and therefore the planet will more or less fix itself.

I don’t care if you want to discredit the food industry and if you hate Big Ag and want to say so on your own blog. But misusing and discrediting science and the efforts of scientists that have nothing to do with that is a stupid and flawed strategy. And Michael Pollan: please use better judgment before hitching your agenda to deniers.

This piece of tripe is one of those sorts of sciency-ness things that Mike the Mad Biologist once hailed as having The Asymmetric Advantage of Bullsh-t.  It has multiple levels of crap. And there isn’t a comment feature on it, so you can’t discuss it over at their site. I will look for other responses to this item and collect them here if I find them, or add them in the comments if you have them.  Anyway, I’m sure someone will take on the #FAIL in other parts of that post–there are plenty of opportunities. I wanted to address the denial aspect.  I agree with Deanna–wow–and I’d love to see a good Fisking by Genomes Unzipped–and it may be coming.
Top tweet on this so far goes to @emmecola:

Baker, M. (2010) Genomics: The search for association. Nature, 467(7319), 1135-1138. DOI: 10.1038/4671135a  

Cyranoski, D. (2010) Genetics: Pet project. Nature, 466(7310), 1036-1038. DOI: 10.1038/4661036a  

  • December 8, 2010
  • 09:07 AM
  • 1,258 views

Tip of the Week: BioGPS for expression data and more

by Mary in OpenHelix


This week’s tip introduces BioGPS, or Gene Portal System. We get a lot of questions about two things that BioGPS can help you to tackle: what do I do with a list of genes to find out what they are? And the next question people have after that is: and where are they expressed? BioGPS can help you with both of those problems. It is a tool that integrates and displays many types of data that researchers would be interested in. It also allows you to customize your display with the types of data that are most relevant to you–using their extensive plug-in collection. And you can do all of this from your browser, or access the basic portal from your iPhone!
Recently there was a question at BioStar about ways to quickly access some human gene expression data. The top rated answer over there was BioGPS, so we thought we’d provide a look at the kinds of things available to users via BioGPS. This 5-minute movie introduces some of the features.
Basically you can search for a gene or a list of genes, you can search with various types of IDs, you can search by keyword, or you can even search by genomic intervals. Your resulting list will quickly link you to all kinds of information from expression data, to annotation details and wikis, and more.  The results are provided in a handy default view with panels of information which may offer what you are looking for.
But you can go further with BioGPS using their customization and plug-in features. You aren’t tied to the default view. The system offers plug-ins: other tools can pipe their information over to BioGPS so you can use it within that framework. You can  register/create a login and then store views that are suited to your research needs.
At the time they wrote the paper provided below, they already had over 150 plug-ins available. As I write this today there are nearly 400 things you could bring in to supplement the views of the genes you are interested in. And the range of plug-ins is tremendous: interaction data sets, SNPs, phylogenetic data… Figure 2 in their paper gives a partial list of the plug-ins at that time, and the categories they highlight include: literature searching (such as PubMed, iHOP, patents, more), gene portals (such as Entrez Gene, UniProt, GeneCards, more), genetics (dbSNP, HapMap, HuGE, more), pathway tools (KEGG, Reactome, STRING, more) and even reagent providers. But there are more now, and it looks like more will continue to be developed and added. It really depends on what you need and want to display for your searches. You can browse around or search the plug-in collection to explore what’s available to view.
There are other tools you can use to explore expression data specifically. We like the UCSC Gene Sorter for some types of queries. Of course the large repositories of GEO and ArrayExpress can offer expression data as well. But for some users the BioGPS portal may offer integration and customization features that will suit their research needs. Go over and check it out. Register, set up some views, and you’ll be finding all sorts of useful annotations for your genes or regions of interest.
Just to also quickly mention: you can do searches from your lab bench, or from seminars, with the iPhone version of BioGPS as well. I didn’t have time to cover that in the movie but there’s more information over at their site about the tool. I’ve got mine installed and I’ve found it handy during talks!
Quick links:
BioGPS homepage: http://biogps.gnf.org/
BioGPS iPhone app: http://biogps.gnf.org/iphone/
Reference:
Wu, C., Orozco, C., Boyer, J., Leglise, M., Goodale, J., Batalov, S., Hodge, C., Haase, J., Janes, J., Huss, J., & Su, A. (2009). BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources Genome Biology, 10 (11) DOI: 10.1186/gb-2009-10-11-r130



  • December 6, 2010
  • 08:00 AM
  • 1,391 views

Non-traditional family structures and genomics

by Trey in OpenHelix

As I and my family await our 23andMe kit to scan our genomes, family history has come back to the forefront of my thoughts. I used to be very fascinated by my own genealogy, and with adopted children, the concepts of descent, biology and culture have taken on adjusted meanings for me. It’s why we have a ‘family map’ instead of a ‘family tree’. The difference between our cultural genealogy and our genetic genealogy has become quite clear to me. Obtaining our family ancestry through these tests will bring a lot of these issues back into focus.
But there is a specific issue that is directly related to genomics, genomics tools and my family: same-gender headed household representation in pedigree and genealogy software. It’s non-existent or takes a difficult workaround to make it happen.
With the rising use of personal genomics data, there is a corresponding rise in the use of pedigree software for medical purposes and genealogy software for family history purposes. Neither of these handles non-traditional family structures well. I use ‘non-traditional’ lightly here, though, because even though same-gender headed households might be relatively new as a recognized family structure, the concept of family can be quite fluid across time and cultures. What is traditional and considered the ‘norm’ for ‘family’ in US culture today (nuclear families of two genders with children born to them) was obviously not the norm in the past, nor is it in contemporary cultures in other parts of the world.
A paper published last year entitled When Family Means More (or less) than Genetics by Burns and Edwards focuses on this inability of current tools to model family histories that aren’t within this norm. As they state:
One challenge in using family history as a health technology is that the geneticist or clinician defines family based on biology, whereas individuals often include those linked socially.
Genetic heritage and history is indeed important in determining disease susceptibilities, but ignoring or misunderstanding socially-defined kinship can lead to misdiagnosis, the lack of understanding of environmental influences and worse. Tools for modeling pedigrees must be able to flexibly model these family structures in order to be useful.
The researchers look at two groups and conclude that current tools are inadequate to model their family structures. Samoans were one group (Japanese-Americans the other):
When Samoan American participants were asked, “tell me about your family,” persons fulfilling social roles were described by that relationship. For example, an individual raised as a brother was identified as a brother whether or not there was a biological basis to the relationship. Similarly, individuals adopted in to or out of a family were described as the children of the family in which they were raised, not as offspring of the biological family. When further questioned, the participants could identify the biological link. But even when the biological relationship was known, the Samoan Americans reported family relationships based on social rather than biological ties.
They go into good detail on why this is a problem. They also, early in the paper, suggest modern American society is changing. Americans are already one of the most ‘adopting’ nations in the world. And, as the authors note, our family structures are becoming more fluid (perhaps converging with Samoan concepts in some ways?):
For example, the Western postmodern family has looser kinship ties than in the past, with relationships that are diverse and fluid (Stacey, 1998). Blended, adoptive, and gay families, as well as those resulting from a variety of assisted reproductive technologies, place an emphasis on choice rather than genetics. For many, family is about social relationships and not solely concerned with the transfer of genes from one generation to the next (Finkler, 2001;Lévi-Strauss, 1969; Peletz, 1995). Nonbiological social factors, such as role behavior, determine family membership, so that a mother’s sister’s son who has been raised with you is your brother (Finkler, 2001). Both formal and informal adoptions are traditional practices and very common in certain societies: Polynesia often being presented as the exemplar (Brady, 1976; Carroll, 1970; Levy, 1973).
So, let me sidestep adoption or other non-genetic descent issues for a moment, and home in on gay families and their representation in the pedigree tools currently available. Though the Recommendations for Standardized Human Pedigree Nomenclature (pdf) mentions it in passing (“For example, information that is commonly recorded on a pedigree (e.g., same-sex relationships…)”), there is no standard suggested. In my and my colleague’s research so far we have yet to find a software or online medical pedigree tool that easily accepts same-gender parental groups, or represents them well.
I took a look at one excellent online tool, Madeline 2.0. If one enters a parent, entering a second parent automatically forces an opposite gender. Though there is the ability to model adoptive relationships, there is as yet no way to model same-gender couples. I wrote to the developers of the tool and received a thoughtful reply. No, there was no ability to do this, but considering adopt-in and adopt-out relationships are modeled, it would make sense to include same-gender couples. They suggested they will indeed consider implementing this. Of course, as with all software and online tools, I know funding, timing and priorities will be an issue. I’ll definitely keep an eye on developments. So as to not single Madeline out, no other tools that we know of (see here, here and here) allow for same-gender couples or same-gender headed families.
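
To show what a more flexible model could look like, here is a minimal Python sketch of a family record that stores each parent-child link with its own type and never checks the genders of a couple. It is not how Madeline 2.0, GEDCOM or any other existing tool works. Every name in it (`Link`, `Person`, `FamilyMap`, `build_example`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Set, Tuple


class Link(Enum):
    """Kind of parent-child tie (hypothetical categories)."""
    BIOLOGICAL = "biological"
    ADOPTIVE = "adoptive"
    SOCIAL = "social"  # e.g. raised in the household without formal adoption


@dataclass
class Person:
    pid: str
    name: str
    gender: Optional[str] = None  # recorded, but never used to validate a couple


class FamilyMap:
    """A 'family map': any number of parents per child, each link typed."""

    def __init__(self) -> None:
        self.people: Dict[str, Person] = {}
        self.links: List[Tuple[str, str, Link]] = []  # (parent, child, kind)

    def add_person(self, pid: str, name: str, gender: Optional[str] = None) -> None:
        self.people[pid] = Person(pid, name, gender)

    def add_parent(self, child: str, parent: str, kind: Link) -> None:
        if child not in self.people or parent not in self.people:
            raise KeyError("add both people before linking them")
        self.links.append((parent, child, kind))

    def parents(self, child: str, kind: Optional[Link] = None) -> List[str]:
        """Parents of a child, optionally restricted to one kind of link."""
        return sorted(p for p, c, k in self.links
                      if c == child and (kind is None or k is kind))

    def genetic_ancestors(self, pid: str) -> Set[str]:
        """Follow only biological links upward (the medical pedigree view)."""
        found: Set[str] = set()
        stack = [pid]
        while stack:
            for parent in self.parents(stack.pop(), Link.BIOLOGICAL):
                if parent not in found:
                    found.add(parent)
                    stack.append(parent)
        return found


def build_example() -> FamilyMap:
    """A child with two adoptive fathers and a known birth mother."""
    fam = FamilyMap()
    fam.add_person("kid", "Child", "F")
    fam.add_person("dad1", "Parent A", "M")
    fam.add_person("dad2", "Parent B", "M")
    fam.add_person("bm", "Birth mother", "F")
    fam.add_person("bgm", "Birth grandmother", "F")
    fam.add_parent("kid", "dad1", Link.ADOPTIVE)
    fam.add_parent("kid", "dad2", Link.ADOPTIVE)
    fam.add_parent("kid", "bm", Link.BIOLOGICAL)
    fam.add_parent("bm", "bgm", Link.BIOLOGICAL)
    return fam
```

With this layout, `build_example().parents("kid", Link.ADOPTIVE)` lists both fathers, while `genetic_ancestors("kid")` follows only the biological line. Keeping the link type on the relationship rather than on the person is what lets the cultural and the genetic views sit side by side in one record.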
When going to family history modeling software for genealogy, the omission is as stark. Every individual has two family trees: a cultural/historical one and a genetic one. For most individuals, those histories overlap. The culture you received from your parents and they from theirs is pretty close to the genetic descent. Even then, it’s not a perfect overlap. What is important to who you are from a cultural or historical perspective might not at all be related to who you are from a genetic one, and who you are is as much cultural as it is genetic. I am as interested in where I got my cultural ancestry as where I got my genetic one. This has become quite clear to me as we’ve adopted children.
And in the future, descendants will look at their family genealogies and it will be very important to them that one of their ancestors was raised by two men, or two women whether adopted or biological from one parent. As these genealogies are built, those relationships which are very important to their family culture and histories should be represented. I know I personally will hope that this will be the case for our family history in the years to follow.
Yet, with the software available, allowing for same-gender parents in the representation is impossible, awkward, or requires a complicated workaround (not to mention paper family trees!). GEDCOM is the de facto standard for exchanging genealogical information. There is no simple standard in GEDCOM for including same-sex parents. That it was developed by the Mormon Church probably has something to do with that ‘oversight’, but frankly given the oversight across the board in pedigree and genealogy standards and software, I doubt that was a deliberate one.
So far I have found software that requires complicated workarounds, like Legacy, or where it’s not easy to figure out (though once you do, it’s simple). Of the many I’ve tried, none even allow it.
In a world where the number of same-sex couples is increasing annually (not to mention adoption, blended families and many other types of structures) and interest in family history is growing through both genomics and culture and history, I look forward to seeing the software catch up and be able to model my family for future researchers and historians.

  • December 4, 2010
  • 02:56 PM
  • 1,250 views

Compelling Research on Writer’s Block, applicable to blogs?

by Mary in OpenHelix


References:
Upper, D. (1974). The unsuccessful self-treatment of a case of “writer’s block” Journal of Applied Behavior Analysis, 7 (3), 497-497 DOI: 10.1901/jaba.1974.7-497a
Didden, R., Sigafoos, J., O’Reilly, M., Lancioni, G., & Sturmey, P. (2007). A Multisite Cross-Cultural Replication of Upper’s (1974) Unsuccessful Self-Treatment of Writer’s Block Journal of Applied Behavior Analysis, 40 (4), 773-773 DOI: 10.1901/jaba.2007.773 Note: Must see the PDF version for full appreciation of this second article.



