GEDCOM variations?


https://support.genopro.com/Topic39144.aspx

By GedcomQuest - Tuesday, January 1, 2019
I am the author of several programs that process GEDCOM files. Some users have sent me GEDCOM files exported from GenoPro. The HEAD record from one of the files looks like this:
0 HEAD
1 SOUR GenoPro
2 NAME GenoPro® - Picture Your Family Tree!(TM)
2 VERS 3.0.1.1
2 CORP GenoPro
2 ADDR http://www.genopro.com
1 DATE 1 DEC 2018
1 CHAR UTF-8
1 GEDC
2 VERS 5.5
2 FORM LINAGE-LINKED
0 GLOBAL
...

That file includes several non-standard GEDCOM records, as do other similar files where the SOUR value is "GenoPro".

While doing some research on this, I discovered that there is a plug-in (or skin?) for the GenoPro report writer that also creates GEDCOM files. I found a GEDCOM file online that may have been written by that plug-in/skin, though I am not sure. Its HEAD record looks like this:
0 HEAD
1 SOUR GenoPro®
2 VERS 2.0.1.3/2007.02.28
2 CORP GenoPro Inc.
3 ADDR http://www.genopro.com
1 DATE 3 DEC 2007
1 SUBM @subm1@
1 GEDC
2 VERS 5.5
2 FORM LINEAGE_LINKED
1 CHAR UTF-8
0 @ind01542@ INDI
...

The example above is from 2007 so it may not be of any use. The HEAD record has some obvious differences, such as the SOUR value: "GenoPro" (ex 1) versus "GenoPro®" (ex 2). I assume the "Â" character is the result of some character encoding issue, perhaps caused by the way the files were handled before they got to me. The HEAD record for example 2 is valid except for the "_" in "LINEAGE_LINKED", which should be "-", i.e., "LINEAGE-LINKED".

So, my questions:

1 - Do GenoPro users use the plug-in/skin to export GEDCOM files from GenoPro?

2 - What is the SOUR value written by the plug-in? I'd like to use the SOUR value to distinguish between the two GenoPro GEDCOM types, but I am not confident I have a valid example file from the plug-in.

Thanks for any help you can offer!

John
By genome - Tuesday, January 1, 2019
Hi John,
As you are aware, GenoPro has a built-in 'Export to Gedcom' facility, but unfortunately it produces a completely non-standard file that is next to useless for data transfer! Its only 'use' is to re-import the data into GenoPro without loss of information.

However GenoPro also has a Report Generator facility that runs a scripting engine allowing users to generate their own reports via 'Skin Templates'.

I am a long-time user of GenoPro and have developed, as a hobby, a number of Report 'Skins', many of which are now included with the GenoPro product.

Many years ago I developed the Export to Gedcom skin using JScript and GenoPro's Report Generator API (www.genopro.com/sdk) to produce a standard Gedcom file that contains as much as possible of the GenoPro data. Many users use this and I certainly strongly recommend its use over the built-in export.

In your examples it looks as though the files have been saved with ANSI encoding, but both exports use UTF-8 encoding.
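
The stray "Â" is consistent with that explanation. A minimal Python sketch of the effect, assuming "ANSI" means Windows-1252 (cp1252) and that the file was written as UTF-8 and later read with the wrong encoding; the sample string is only an illustration:

```python
# Assumption: "ANSI" here means Windows-1252 (cp1252).
original = "GenoPro®"

# Write the text as UTF-8, then misread the same bytes as cp1252.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)

# The damage is reversible as long as no bytes were lost on the way.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

The "®" sign is two bytes in UTF-8, and cp1252 shows the first of those bytes as "Â".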

One way to distinguish between them is the '/' character in the VERS tag. My Report Skin has [GenoPro version]/[date of Report Skin version], whilst the built-in version will only have the GenoPro version.

Thanks for pointing out the error in the FORM tag; I'll sort that.

If you receive files from the built-in version, I suggest you ask the submitter to use the Report Skin version instead.

Happy to give any further help but you can download GenoPro for free and experiment with small samples (up to 25 individuals) and save them without a licence. The download will include the Export to Gedcom via Report Generator skin.

GenoPro is not that clever at importing Gedcom data either, so at present I am writing a 'web app' in HTML5 / JavaScript, utilising some excellent JavaScript libraries, to provide an alternative import.

Happy New Year!

Ron
By NiKo - Wednesday, January 2, 2019
I use a GEDCOM file from GenoPro to import a family tree into Genome MatePro (www.getgmp.com) for Autosomal DNA Analysis.

Page 125 of the latest GMP Manual (2018-09-09 version) says:
"A Gedcom in 5.5 format using UTF-8 is needed (5.5.1 has been known to cause issues). If you are using Legacy Family Tree, select the “Generic” format."



I've tried both versions of the GEDCOM available on GenoPro, and only the Report Generator version works for transfer to GMP.  I'm not sure if it uses the 5.5 or the 5.5.1 format.
By GedcomQuest - Wednesday, January 2, 2019
Ron,

Thanks for your prompt and detailed reply. I will amend my programs to use the "/" in the HEAD.SOUR.VERS value to distinguish between the two GenoPro GEDCOM formats. I will also experiment with the trial version of GenoPro.

John
By GedcomQuest - Friday, January 4, 2019
Ron,

I installed the trial version of GenoPro and created a very small test dataset.

The standard export creates a GEDCOM file with "1 SOUR GenoPro" whereas exporting via the report generator yields "1 SOUR GenoPro®". Rather than use the "/" in the version number, I'll use the difference in the SOUR values to distinguish between the two variations ("®" means from report generator).
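
Both tests mentioned in this thread (the "®" in the SOUR value and the "/" in HEAD.SOUR.VERS) can be combined. The following Python is only a sketch under the assumptions stated in the comments; the function names and the returned labels are hypothetical and not part of any GenoPro tool:

```python
def head_lines(gedcom_text):
    """Return the non-blank lines of the HEAD record (up to the next level-0 line)."""
    lines = []
    for raw in gedcom_text.splitlines():
        line = raw.strip().lstrip("\ufeff")  # drop whitespace and a possible BOM
        if not line:
            continue  # tolerate blank lines inside HEAD
        if line.startswith("0 ") and line != "0 HEAD":
            break
        lines.append(line)
    return lines


def classify_genopro(gedcom_text):
    """Guess which GenoPro export wrote the file.

    Assumption (from the examples in this thread only): the report skin
    writes "GenoPro®" as the SOUR value and a "/" in HEAD.SOUR.VERS,
    while the built-in export writes plain "GenoPro" and no "/".
    """
    sour = None
    vers = None
    parent = None
    for line in head_lines(gedcom_text):
        parts = line.split(" ", 2)
        level = parts[0]
        tag = parts[1] if len(parts) > 1 else ""
        value = parts[2] if len(parts) > 2 else ""
        if level == "1":
            parent = tag
            if tag == "SOUR":
                sour = value
        elif level == "2" and parent == "SOUR" and tag == "VERS":
            vers = value
    if sour is None or not sour.startswith("GenoPro"):
        return "not-genopro"
    # "®" is still present when the file was misread as "GenoPro®".
    if "®" in sour or "/" in (vers or ""):
        return "report-skin"
    return "built-in"
```

Checking both markers means a file still classifies correctly if only one of them survives a round trip through another program.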

When I exported from the report generator, the file included a blank line after the "2 VERS ..." record:
0 HEAD
1 SOUR GenoPro®
2 VERS 3.0.1.4/2015.02.01

2 CORP GenoPro Inc.
3 ADDR http://www.genopro.com
1 DATE 4 JAN 2019
1 SUBM @subm1@
1 GEDC
2 VERS 5.5
2 FORM LINEAGE_LINKED
1 CHAR UTF-8
0 @ind00001@ INDI
Most readers will ignore the empty record and issue an error message. However, some programs are more sensitive to issues in the HEAD record than elsewhere in the file, so it's best to resolve HEAD record issues.
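
Until the skin is fixed, the empty line can be removed after export. A minimal Python sketch, assuming that a line containing only whitespace never carries data and that normalising line endings to "\n" is acceptable; the function name is hypothetical:

```python
def drop_blank_lines(gedcom_text):
    """Remove empty and whitespace-only lines from GEDCOM text."""
    kept = [line for line in gedcom_text.splitlines() if line.strip()]
    # Line endings are normalised to "\n"; adjust if the reader needs CRLF.
    return "\n".join(kept) + "\n"
```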

I was a little surprised to see that "GEDCOM Validator" by Chronoplex was flummoxed by the empty line; it failed to process the HEAD records, reported that "GEDCOM version '2.1' is not supported", and gave up. When I removed the empty line, it processed the file and reported these issues:
Info: The file will be processed as a GEDCOM 5.5 file using illegal encoding 'UTF-8'.
Error: 'UTF-8' encoding is only valid for GEDCOM 5.5.1 and later.
Warning: The length of the <MULTIMEDIA_FILE_REFERENCE> is limited to 30 code units but this is too short for most file paths.

I am not aware of any programs that adhere to the <MULTIMEDIA_FILE_REFERENCE> limit; it's ridiculous.

I was surprised the Chronoplex validator did not mention that "LINEAGE_LINKED" was invalid. I changed the value to "LINEAGE-LINKED" and it accepted that, too. I double-checked the 5.5 and 5.5.1 specs, and only "LINEAGE-LINKED" is valid. I specified a completely bogus value, and it reported that as an error. So, I think the Chronoplex validator is accepting an illegal value there. It's pretty solid overall, but it's just software and so it has issues.

I also checked the original file (with the empty line) using the online GEDCOM validator at http://ged-inline.elasticbeanstalk.com/validate.
*** Line 3: Invalid content for VERS tag: '3.0.1.4/2015.02.01' is more than 15 characters, the maximum length for <VERSION_NUMBER>
*** Line 10: Note that the de facto standard GEDCOM version is version 5.5.1
*** Line 12: Invalid content for CHAR tag: 'UTF-8' is not a valid <CHARACTER_SET>
*** Line 25: Invalid content for FILE tag: 'C:\Exhibits\1963-00-00-Cardinal,Peter-5140-hs.jpg' is more than 30 characters, the maximum length for <MULTIMEDIA_FILE_REFERENCE>

So, the ged-inline validator ignored the blank line but reported another issue in addition to the ones reported by Chronoplex: <VERSION_NUMBER> is too long. I've seen plenty of software with long VERS values, though I think the trend is to stick to version numbers only and use shorter values. So, for example, FTM used to have "1 SOUR FTM, 2 VERS Family Tree Maker (21.0.0.723)", but now they have "1 SOUR FTM, 2 VERS 23.1.0.1480".
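
The two length limits quoted by the validators above (15 for <VERSION_NUMBER>, 30 for <MULTIMEDIA_FILE_REFERENCE>) can be checked with a few lines of Python. This is a rough sketch rather than a validator: it applies the limit to every VERS and FILE line regardless of context, counts Python characters rather than encoded code units, and uses hypothetical names throughout:

```python
# Limits as quoted in the validator messages in this thread.
LIMITS = {"VERS": 15, "FILE": 30}


def long_values(gedcom_text):
    """Return (line number, tag, value length) for values over the limit."""
    problems = []
    for number, raw in enumerate(gedcom_text.splitlines(), start=1):
        parts = raw.strip().split(" ", 2)
        if len(parts) < 3:
            continue  # blank line, or a line without a value
        tag, value = parts[1], parts[2]
        limit = LIMITS.get(tag)
        if limit is not None and len(value) > limit:
            problems.append((number, tag, len(value)))
    return problems
```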

Technically, using UTF-8 is also invalid with GEDCOM 5.5, but plenty of other programs also do that.

The main issue is the empty line. Perhaps I did something to trigger it. If so, I am not sure what. I looked at the report generator options and I didn't see anything. If you are going to change the "LINEAGE_LINKED" to "LINEAGE-LINKED", perhaps you can investigate the empty line and fix that (if necessary), too.

John
By genome - Friday, January 4, 2019
Thanks for reporting back your findings John.

I have found where the blank line is coming from and will remove it. I will also make Gedcom version 5.5.1 the default so as to cover UTF-8 and allow automatic inclusion of extra tags, e.g. LANG & LATI

best wishes,

Ron
By GedcomQuest - Friday, January 4, 2019
Ron,

Sounds good.

If you contact me via PM on this forum and include your email address, I'll send you a license for my primary GEDCOM-based product.

John
By NiKo - Monday, January 7, 2019
Hi Ron,

I see you plan to change the version number from 5.5 to 5.5.1 for the GEDCOM report.  I'm not sure if this is going to create problems for me importing a GEDCOM file into Genome MatePro.  See the outlined section below.

http://support.genopro.com/Uploads/Images/b741187f-86b3-4df5-a99b-4798.png




What are my options here?  Do I generate the GEDCOM file and then go in and edit the VERS from 5.5.1 to 5.5, or something else?

Are there any other changes you plan to make to bring it up to the 5.5.1 spec?

Thanks,

Nick
By genome - Monday, January 7, 2019
The Gedcom Export is already using version 5.5.1 of the spec, because it has UTF-8 encoding, but as John pointed out, the header is technically invalid as it stated version 5.5, so this amendment is to correct that mistake.

There has always been the option via the Configuration Parameters dialogue to include some of the newer 5.5.1 tags, i.e. WWW, EMAIL, LATI & LONG.  

Now when that option is set (as it will be by default) then the header version will reflect that, i.e. will set to 5.5.1.  If you uncheck the Use 5.5.1 Tags option then the header will show 5.5 but encoding (CHAR) will still be UTF-8, maintaining the status quo.
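
If an importing program still insists on a particular version in the header, the manual edit Nick describes can also be scripted. A minimal Python sketch, assuming only the HEAD.GEDC.VERS line needs to change; the function name is hypothetical, and it does not touch any 5.5.1-only tags elsewhere in the file:

```python
def set_gedc_version(gedcom_text, version="5.5"):
    """Rewrite the HEAD.GEDC.VERS value and leave every other line alone."""
    out = []
    in_head = False
    parent = None
    for line in gedcom_text.splitlines():
        parts = line.strip().split(" ", 2)
        level = parts[0]
        tag = parts[1] if len(parts) > 1 else ""
        if level == "0":
            in_head = tag == "HEAD"
        elif level == "1":
            parent = tag
        elif in_head and level == "2" and parent == "GEDC" and tag == "VERS":
            line = "2 VERS " + version
        out.append(line)
    return "\n".join(out) + "\n"
```

Unchecking the Use 5.5.1 Tags option remains the simpler route, since it also keeps the newer tags out of the file.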
By Erhardt Stiefel - Wednesday, January 9, 2019
Hello Gurus, Masters and other experts,
I have read your discussion with great interest, but understood only a little.
Now I'm confused and don't know what I have to do to make my GEDCOM export files readable in other programs.
Could you please give a normal GenoPro user some simple advice on what to change, and where?
Thanks,
Erhardt (using GenoPro for 13 years)
By genome - Wednesday, January 9, 2019
I have attached a zip with the updated Report skin for export to gedcom.  Hopefully I can get it included in the GenoPro download soon.

But for now just download and unzip in place of the existing {EN} Export to Gedcom folder below your report skins folder. 

You can find the location of your report skins folder via the Options tab of the GenoPro Tools/Generate Report dialogue.

The default configuration options should provide the best Gedcom file for data interchange.
By maru-san - Tuesday, January 15, 2019
After a few successful exports of a gedcom file, I received the following error message:

Generating report to 'C:\Users\..\Documents\GenoPro Reports\gedcom\'
Cloning document gw_2019...
Opening configuration file Config.xml for skin '\{EN} Export to Gedcom\*  (2019.01.02)'...
Loading Dictionary.xml...
[0.00] Processing template 'Gedcom.js'...
To enable display of parameter settings, untick the box under 'Options' tab of this dialog.
Base skin version 2019.01.02
Error at line 216, position 2 (Code/Utils.js): 'strKey' is undefined
    Microsoft JScript runtime error 800A1391

What could be the reason for this? I added additional data in between.

regards
By genome - Wednesday, January 16, 2019
That error has been lying dormant and undiscovered for a while, as Code/Utils.js has not changed for over 4 years. The script is failing when trying to output an error message, possibly concerning custom markup in your .gno.

I found two issues, a typo in a variable name and missing error text.  The amended skin attached should resolve this issue.


By maru-san - Wednesday, January 16, 2019
Thanks!!
By NiKo - Thursday, October 24, 2019
Any plans/need to update for GEDCOM 5.5.5?

By BlackAdder - Sunday, February 9, 2020
Hi Ron, times are a changing.   I have a same sex couple in my family tree.   Everything looks good until I get to the FAM block in the Gedcom where it only reports one of the individuals:
>>>>>
0 @fam00001@ FAM
1 WIFE @ind00002@
1 CHIL @ind00001@
1 MARR
2 TYPE Civil Marriage
<<<<<

The second FAMS person (ind00003) is not shown.   I tried this with two Females and with two Males just to be sure.

Are you able to update your "{EN} Export to Gedcom" plugin to cater for these situations which I'm sure will become more common place in time?

Thanks
By genome - Tuesday, February 11, 2020
Apologies for the late response, NiKo. I looked at the 5.5.5 spec when you posted but couldn't see anything I need to change to meet it, i.e. I consider the current output to be fully compatible, so the only possible change would be to change the version in the header. Unless anyone thinks differently.....

BlackAdder, I shall investigate same sex and polygamous relationships. I suspect my script just looks for the one male and/or female.
By genome - Wednesday, February 12, 2020
Ok I have now looked into the Gedcom spec and it only allows one HUSB tag and/or one WIFE tag under the FAM record, however no gender is implied in the use of either tag. 

I have amended the script so that where there is more than one partner in a family relationship then HUSB and WIFE tags are generated for the first two partners. If the first partner is male then HUSB is used with WIFE for the second irrespective of gender. Similarly if the first partner is female then WIFE is used and HUSB for the second.

If there are more than 2 partners then a warning is generated.
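
The rule just described can be written out as follows. This is a Python sketch of the stated behaviour, not the skin's actual JScript; the function name, the (xref, gender) input shape, the warning text, and the treatment of a single partner or a first partner of unknown gender (WIFE) are assumptions:

```python
def partner_tags(partners):
    """Build HUSB/WIFE lines for a FAM record.

    partners: list of (xref, gender) tuples in family order, gender "M" or "F".
    Returns (lines, warning); warning is None unless partners were omitted.
    """
    if not partners:
        return [], None
    # The first partner decides the tag order; the second partner takes the
    # remaining tag irrespective of gender.
    if partners[0][1] == "M":
        first_tag, second_tag = "HUSB", "WIFE"
    else:
        # Assumption: any first partner that is not male gets WIFE.
        first_tag, second_tag = "WIFE", "HUSB"
    lines = ["1 " + first_tag + " " + partners[0][0]]
    if len(partners) > 1:
        lines.append("1 " + second_tag + " " + partners[1][0])
    warning = None
    if len(partners) > 2:
        omitted = len(partners) - 2
        warning = str(omitted) + " partner(s) omitted: FAM allows one HUSB and one WIFE"
    return lines, warning
```

For the two-female family reported above, this produces a WIFE line for @ind00002@ and a HUSB line for @ind00003@.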

The amended skin is attached, download and unzip into your skins folder to replace the existing Gedcom Export skin folder.
By Joyaa - Sunday, February 16, 2020
Hi Ron,


I am not a GenoPro user but have been sent a GenoPro-created gedcom file that I wish to append (with the owner's permission) to my comprehensive existing data. The non-standard approach that GenoPro uses to label sources and events (like an individual's occupation) appears to make the job very much harder.

I am sure you will know this yourself, but by way of example for other readers:

My own data shows an individual's occupation in this way:

1 OCCU He had a photographic studio

In contrast, the GenoPro gedcom shows:

1 OCCUPATIONS @occu00048@, @occu00093@

and elsewhere in the file there is a key:

0 @occu00048@ Occupation
1 TITL He had a photographic studio
0 @occu00049@ Occupation
1 TITL ...etc....

Am I best to undertake some serious editing of the GenoPro gedcom file using NotePad++ or similar in order to render it GEDCOM-standard?
Or is there a routine I can use - or send to a GenoPro user to apply for me - to create a GEDCOM-standard gedcom from the file that I have been sent?
The Gedcom file I have been sent begins:  

0 HEAD
1 SOUR GenoPro
2 NAME GenoPro® - Picture Your Family Tree!(TM)
2 VERS 3.0.1.4
2 CORP GenoPro
2 ADDR http://www.genopro.com
1 DATE 27 JAN 2020
1 CHAR UTF-8
1 GEDC
2 VERS 5.5
2 FORM LINAGE-LINKED
etc...

It is 57,000 lines long.

Thanks, Joyaa
By genome - Sunday, February 16, 2020
Hi Joyaa

If you read the whole of this thread you will see that GenoPro users have two methods of creating a gedcom file.  You have been sent a 'gedcom' produced using GenoPro's built-in File/Export/Export to Gedcom method which as you have discovered does not produce a useful Gedcom file at all.

Ask your sender to produce the gedcom file using the Report skin {EN} Export to Gedcom, which generates a much more compatible Gedcom file.
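
Where re-exporting is not possible, the occupation pattern Joyaa quotes could in principle be flattened by a script. The Python below is a sketch based solely on the fragments shown in this thread ("1 OCCUPATIONS @occu…@, …" pointing at "0 @occu…@ Occupation" records with a "1 TITL" line); the full built-in format is not documented here, the function name is hypothetical, and sources and other record types are not handled:

```python
import re


def inline_occupations(gedcom_text):
    """Replace '1 OCCUPATIONS @x@, @y@' with one '1 OCCU <title>' line each."""
    lines = gedcom_text.splitlines()

    # Pass 1: collect the title of every '0 @occuNNN@ Occupation' record.
    titles = {}
    current = None
    for line in lines:
        text = line.strip()
        match = re.match(r"0 (@occu\d+@) Occupation\s*$", text)
        if match:
            current = match.group(1)
        elif text.startswith("0 "):
            current = None
        elif current and text.startswith("1 TITL "):
            titles[current] = text[len("1 TITL "):]

    # Pass 2: expand the references. The key records are left in place.
    out = []
    for line in lines:
        text = line.strip()
        if text.startswith("1 OCCUPATIONS "):
            for xref in re.findall(r"@occu\d+@", text):
                if xref in titles:
                    out.append("1 OCCU " + titles[xref])
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```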
By Joyaa - Monday, February 17, 2020
Thanks Ron.  I have sent this to my cousin.  He replies:  

getting an error:
Error at line 242, position 8 (Code/Utils.js): Exception thrown and not caught Microsoft JScript runtime error 800A139E

What do you think is happening?  Thanks.
By genome - Tuesday, February 18, 2020
Well, I have never known this error to occur before now, but it suggests that the GenoPro XML is invalid, i.e. syntactically incorrect. GenoPro holds its data in a zipped XML data file, saved as .gno.

The GenoPro Report Generator provides an interface allowing the report script to obtain the underlying XML text (ReportGenerator.Document.GetTextXml). The gedcom script then tries to load it into a DOM using the Microsoft XML parser, which is when the error occurs.

To help diagnose the cause of this previously unreported issue could you ask your cousin to use GenoPro's menu function File / Export / Export to XML to obtain the XML in a file for you. 

Opening the XML file with IE or Chrome should then flag the first invalid XML in the file. Report any findings here. Also, if possible, email me a copy of the XML file by attaching it to an email sent via the email button on the left of this post under my user id.
By Joyaa - Tuesday, February 18, 2020
Thank you Ron.

My cousin reports:

"This page contains the following errors:
error on line 18591 at column 7335: xmlParseCharRef: invalid xmlChar value 11
Below is a rendering of the page up to the first error."

I will email you the XML file - and thanks.  Joyaa
By genome - Thursday, February 20, 2020
On further investigation of this issue I discovered that GenoPro allows text to contain characters that are invalid in XML files. Characters with decimal code values below 32, i.e. below the space character in the ASCII character set, are 'control' characters. The only control characters that XML 1.0 allows are 9 (tab), 10 (linefeed) and 13 (carriage return); the others are not permitted even inside CDATA sections. The character causing the problem in Joyaa's case is decimal code 11, which is known as VT or Vertical Tab. I presume this was imported via a gedcom file or copied and pasted from another source.

An error is flagged if XML data containing invalid characters is read/parsed using any standard XML parser. I have amended the script to remove invalid control characters before loading.  This allows a gedcom file to be produced but note that the invalid characters will still remain in the output .ged as the data for it is obtained directly via the GenoPro Report Generator API and not via the XML.  Joyaa's file also contained Tab characters which in fact are also illegal in Gedcom files. 

If the package you use to import the .ged file objects to these characters then I suggest you edit the .ged with a good text editor such as NotePad++  that allows find and replace for control characters via hexadecimal values.
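
The same clean-up can be scripted instead of done by hand. A minimal Python sketch, assuming the .ged has already been read into a string; the function name is hypothetical, and replacing tabs with a single space is an assumption about what the importing package will accept:

```python
import re

# Control characters below 32 other than tab (9), linefeed (10)
# and carriage return (13). Vertical Tab (11) is \x0b.
INVALID_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def clean_ged_text(text, replace_tabs=True):
    """Remove invalid control characters and optionally replace tabs."""
    text = INVALID_CONTROLS.sub("", text)
    if replace_tabs:
        # Assumption: a single space is an acceptable stand-in for a tab.
        text = text.replace("\t", " ")
    return text
```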

To use the amended report skin, download the attached zipped folder and unzip it into your GenoPro Report Skins folder, replacing the existing {EN} Export to Gedcom folder.