Here are few lessons learnt from playing with the BDC on a project
1. Even though the BDC can crawl associated entities in your database schema, the search results will not show related entity instances for the item you searched.
2. In case of mapping 'complex' db schemas specially ones that contain m:n relationships between entities, sometimes it might be advisable to show un-normalized relationships via stored procedures or views, but this approach provides the following roadblocks
- Un-normalized views/data will often have composite keys to uniquely identify a single row. You will have to define separate identifiers for each component of your composite key.
- More often than not when there isn't an associated entity to the parent entity the columns representing that entity in the un-normalized entity will be null, including part of the composite key. This is an issue with the BDC. If a key or part of a key is null, the BDC ignores that entity instance, so you would need to pad your null keys, (in SQL use ISNULL or COALESCE).
- If you are using a Stored-Proc to provide your un-normalized view then you would be defining your MinGeneratedKey and MaxGeneratedKey as Stored-Proc parameters, which will be used in the 'where' section of the query. Please note; remember those padded keys, you would need to pad them in your Where clause as well because when SQL executes the "Where" component of the query, the values returned will be null and outside your MinGeneratedKey and MaxGeneratedKey range. (Where component is executed before your Select, so padding your null keys in the Select portion of query is not enough)
3. Moving on to incremental crawls, the only reference to creating them in MSDN or the OSS SDK is a slightly confusing note on this page. To implement incremental crawls on the BDC you would need the following
1. You would need some column on your table/view/or Stored-Proc to indicate the last modified time of that entity instance. Adding a timestamp column to your tables is the easier approach since it requires no change to any application logic to update the last modified time. Point to note is that if you are combining tables to create an un-normalized view then you would need to calculate the greater timestamp in SQL from your combined entity instances. Also note, if you are using a timestamp column, you would need to cast it to DateTime in SQL for it to be of any use to the BDC.
2. In your IdEnumerator method you would need to declare a type descriptor for this column in your return parameter like so.
<Parameter Direction="Return" Name="entityReturnParam">
<TypeDescriptor TypeName="System.Data.IDataReader, System.Data, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" Name="entityDataReader" IsCollection="true">
<TypeDescriptors>
<TypeDescriptor TypeName="System.Data.IDataRecord, System.Data, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" Name="entityDataRecord">
<TypeDescriptors>
<TypeDescriptor TypeName="System.Int32" IdentifierName="entityPK" Name="entityPK" />
<TypeDescriptor TypeName="System.DateTime" Name="timestamp" />
</TypeDescriptors>
</TypeDescriptor>
</TypeDescriptors>
</TypeDescriptor>
</Parameter>
3. You will need to declare a propperty on your entity that refers to the timestamp column defined above, but this propperty needs to be called __BDCLastModifiedTimestamp and should be of type string
<Entity EstimatedInstanceCount="0" Name="entityName">
<Properties>
<Property Name="Title" Type="System.String">entityName</Property>
<Property Name="__BdcLastModifiedTimestamp" Type="System.String">timestamp</Property>
</Properties>