Friday, 24 February 2012

Wednesday, 15 February 2012

SharePoint Search – a look behind the scene Part III

Part III: SharePoint Search and BCS
(...which data is stored for crawled and managed properties in the SharePoint Search databases?)
In the second post of this series we saw that the metadata shown in the result set is stored in SQL Server, even when the data comes from an external BCS source. Metadata such as “name”, “description” etc. is stored in the PropertyBlob field explained in the first post. But this blob only contains data that is part of the managed property “HitHighlightedSummary”. What happens if we now map a crawled property coming from a BCS source to a managed property?
Situation:
We have an External Content Source called “LOB2”. This content source is connected to a SQL Server database called “MiCLAS_TEST”.
The External Content Source contains the table called “cis.Vorgang_z
If we now search for, say, the term “330114”, which is a value from the column “VorNummer”, we get this result in SharePoint Search:
If we capture the call with SQL Server Profiler (as described in Part II) and run it in SQL Server Management Studio, it looks like this:
The PropertyBlob in this case contains the following data as varbinary:
0x020200EEFFFE7F99010000BD0010000080024078DAC590314BC36018841F639A96125408A50E0E19BA65B198A14310A4140A828242A0434B054104ADAE6EDF3F4F9FA420F41734C7BD977BBF5BDE6B9AA689801B4918CADE819C1387849833DAD97ED75D2671240106EEFB5CC8DAFFB44B24A2161BAE744FE24EDC3215A55B8EFC921FBE78E7931D1FE4CCF5DFFCF2A6FFD39F1A35CFBCF0C803AF62C1B6BB7FED88BD7AE08D11993D8D75B93A0999BC3CA89922A414E64ADF66763AB3DF7F7557D96565BFF766E78C58A97BBEC92717006452D3CBBACA8100805882AFCDCB81
If we now map crawled properties to managed properties, the PropertyBlob changes (the updated blob is shown in footnote 1 at the end of this post). So where is the information stored that is used to generate the updated PropertyBlob? (The update happens during a full crawl of the content source.) The answer is the table “dbo.MSSDocProps” in the “Search_Service_Application_PropertyStoreDB” database.

Let’s have a look using the DocId 2056 from the call above:
(For better understanding, I also add the “FriendlyName” column from the table dbo.MSSManagedProperties to the query.)

USE Search_Service_Application_PropertyStoreDB_b506dce49c514f8899ae51e503889885;
-- All property values stored for DocId 2056, plus the friendly name of each managed property
SELECT v2.FriendlyName,
       v1.*
FROM dbo.MSSDocProps AS v1
JOIN Search_Service_Application_DB_dd13ba19a7bb4ffaafcc3e626e73c949.dbo.MSSManagedProperties AS v2
  ON v1.PID = v2.PID
WHERE v1.DocId = 2056;
Now I map some crawled properties to managed properties:
After a full crawl I execute the query again:
For the “LOB2Date” and “LOB2Cur” values you can see that the data is in the IIVal column. The “LOB2Bez” value is stored as clear text in the strVal2 column. LOB2Date is an encoded datetime value based on the DateTime Structure data type. LOB2Cur is a decimal value.
Footnote 1: The updated PropertyBlob now contains the following:
0x020400EEFFFE7F9A0100009B0100009D010000BD0010000F000800008002400100014078DAC590314BC36018841F639A96125408A50E0E19BA65B198A14310A4140A828242A0434B054104ADAE6EDF3F4F9FA420F41734C7BD977BBF5BDE6B9AA689801B4918CADE819C1387849833DAD97ED75D2671240106EEFB5CC8DAFFB44B24A2161BAE744FE24EDC3215A55B8EFC921FBE78E7931D1FE4CCF5DFFCF2A6FFD39F1A35CFBCF0C803AF62C1B6BB7FED88BD7AE08D11993D8D75B93A0999BC3CA89922A414E64ADF66763AB3DF7F7557D96565BFF766E78C58A97BBEC92717006452D3CBBACA8100805882AFCDCB8148006F006C00640069006E006700200043006F006D00700061006E007900E07AB02600000000
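Compared with the first blob, the updated one has extra bytes appended at the end. The following is a minimal sketch, not a description of the PropertyBlob format: it only assumes that the appended fragment copied from the hex dump above is UTF-16 little-endian text. The helper name decode_utf16_fragment is made up for this illustration.

```python
# Fragment copied from the end of the updated PropertyBlob shown above
# (the bytes directly before the final E07AB02600000000).
TAIL_HEX = (
    "48006F006C00640069006E0067002000"
    "43006F006D00700061006E007900"
)


def decode_utf16_fragment(hex_fragment: str) -> str:
    """Decode a hex string as UTF-16 little-endian text.

    Assumption: the fragment is plain UTF-16LE text. Nothing here
    describes how the rest of the blob is laid out.
    """
    return bytes.fromhex(hex_fragment).decode("utf-16-le")


if __name__ == "__main__":
    print(decode_utf16_fragment(TAIL_HEX))
```

Under that assumption the fragment decodes to readable text, which suggests that at least one of the newly mapped values is written into the blob uncompressed.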
So we see that all content mapped to managed properties is stored in the “Search_Service_Application_PropertyStoreDB” database, even when the data comes from an external BCS source. This is interesting for security considerations (there will be a separate post about this soon) and also for SQL Server maintenance and index defragmentation. There is a very good article about this available on blogs.msdn: LINK
See the complete post, including the hands-on lab, as a webcast here:


Tuesday, 7 February 2012

SharePoint Search – a look behind the scene Part II

Part II: SharePoint Search and BCS
(...which data is stored in the SharePoint Search databases?)

Based on the first post “SharePoint Search – a look behind the scene Part I”, we will now take a closer look at what happens in SQL Server when a search request is generated (in this case against a BCS source).

In this example I used an existing BCS source in my DEMO environment, described here: LINK

In the SharePoint Search Center I used the search text “DVD”.

With SQL Server Profiler I can capture the query that SharePoint fires against SQL Server:

exec sp_executesql
N'/* {00C1C222-BB31-408A-815F-C3A76E85C290} */
exec dbo.proc_MSS_GetMultipleResults @P1,@P2,@P3,@P4,@P5',
N'@P1 int,
@P2 nvarchar(4000),
@P3 nvarchar(4000),
@P4 nvarchar(4000),
@P5 varbinary(8000)',-2147483647,N'dvd',N'',N'SET NOCOUNT ON ;
DECLARE @joinRows INT ;
/* 1 */ SET @joinRows = DATALENGTH(@joinData) / 8 ;;
/* 2 */ WITH DocIds(DocId, Value) AS (
/* 3 */ SELECT TOP(@joinRows) CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 1, 4) AS INT),
CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 5, 4) AS INT)
FROM dbo.MSSOrdinal AS ord WITH(NOLOCK) WHERE ord.n <= @joinRows )
/* 4 */ SELECT P.DocId, P.SummaryBlobSize, P.Size, P.LastModified, P.IsDocument, P.IsPictureUrl, P.Author, P.Title, P.Url,
P.PictureThumbnailUrl, P.ContentClass, P.FileExtension, P.Tags, P.PropertyBlob,
CASE WHEN P.PropertyBlob IS NULL THEN 0 ELSE DATALENGTH(P.PropertyBlob) END,
P.PopularSocialTags, P.SiteName, P.Description, P.ParentLink, P.NumberOfMembers, P.PictureHeightAndWidth, P.DisplayDate
FROM dbo.MSSDocResults AS P WITH(NOLOCK), DocIds AS T WHERE P.DocId = T.DocId OPTION (MAXDOP 1) ',0x00001592000000000000...700000031


So let’s see what exactly happens by taking that query call apart.
We can see that the stored procedure dbo.proc_MSS_GetMultipleResults is called. But we want to go one step deeper and find out what is behind this call. (The stored procedure dbo.proc_MSS_GetMultipleResults will be covered in one of the next posts.)
First of all, the DATALENGTH of the generated BLOB, which is stored in @joinData (details on the BLOB data can be found here: LINK), is used to set the variable @joinRows:
1.       SET @joinRows = DATALENGTH(@joinData) / 8
Let’s see what the result is using this query:
                SELECT DATALENGTH(0x000015920000…700000031) / 8

The result is “50”, so @joinData holds 50 entries of 8 bytes each.

In the next step a temporary result set called “DocIds” is created using the “WITH” clause (a common table expression).
2.       WITH DocIds(DocId, Value) AS (…
The call generates a result set that looks like this:
In the first column we see the DocIds we will need in the next step. But let’s look at how this result is generated. The relevant part of the query is this:
3.       SELECT TOP(@joinRows)
CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 1, 4) AS INT),
CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 5, 4) AS INT)
FROM dbo.MSSOrdinal AS ord WITH(NOLOCK) WHERE ord.n <= @joinRows
Applying CAST(SUBSTRING(@joinData,((ord.n-1)*8) + 1, 4) AS INT) to the BLOB data stored in @joinData extracts the item identifiers; the second CAST, starting at offset 5, extracts the value paired with each identifier (its rank). These ID/value pairs are also described here: LINK. This is all the magic behind the result set shown in step 2.
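The same slicing can be reproduced outside SQL Server. The sketch below mirrors the SUBSTRING/CAST logic in Python, relying on the fact that T-SQL converts a 4-byte binary value to INT as a big-endian signed integer. The function name decode_join_data is made up for this illustration. The first eight bytes of the sample are taken from the start of the captured call; the second pair is a purely hypothetical value added so the sample has more than one row.

```python
import struct


def decode_join_data(join_data: bytes) -> list:
    """Split a @joinData-style blob into (DocId, Value) pairs.

    Mirrors the T-SQL above: each entry is 8 bytes, the first
    4 bytes are the DocId and the next 4 bytes are the paired
    value, both read as big-endian signed 32-bit integers.
    """
    join_rows = len(join_data) // 8  # DATALENGTH(@joinData) / 8
    return [
        struct.unpack_from(">ii", join_data, row * 8)
        for row in range(join_rows)
    ]


if __name__ == "__main__":
    # First entry: leading bytes of the captured blob (0x0000159200000000).
    # Second entry: hypothetical values, not taken from the trace.
    sample = bytes.fromhex("0000159200000000") + struct.pack(">ii", 2056, 7)
    print(decode_join_data(sample))
```

A 400-byte blob yields 50 pairs this way, which matches the DATALENGTH(...) / 8 result from step 1.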
Next is a simple join between the temporary result set “DocIds” and the content of the table “dbo.MSSDocResults”. The join is done by the condition "WHERE P.DocId = T.DocId" shown in the query below:
4.       SELECT P.DocId, P.SummaryBlobSize, P.Size, P.LastModified, P.IsDocument, P.IsPictureUrl, P.Author, P.Title, P.Url, P.PictureThumbnailUrl, P.ContentClass, P.FileExtension, P.Tags, P.PropertyBlob,
CASE WHEN P.PropertyBlob IS NULL
THEN 0
ELSE DATALENGTH(P.PropertyBlob)
END,
P.PopularSocialTags, P.SiteName, P.Description, P.ParentLink, P.NumberOfMembers, P.PictureHeightAndWidth, P.DisplayDate
FROM dbo.MSSDocResults AS P WITH(NOLOCK),
DocIds AS T WHERE P.DocId = T.DocId OPTION (MAXDOP 1)
The result is:
So we see that the results returned by SQL Server contain all the data needed, even when the data comes from an external BCS source. Metadata such as “name”, “description” etc. is stored in the PropertyBlob field explained in the first post.
If we now open the profile page of the BCS source, all configured data fields are needed. This results in a call against the data source defined in the BCS model. In my case this is also a SQL Server call, because my external data source is a SQL Server database.
Captured with SQL Server Profiler it looks like this:
exec sp_executesql N'SELECT [ProductKey] , [ProductName] , [ProductDescription] , [BrandName] , [ClassName] , [ColorName] , [UnitPrice] , [ProductSubcategoryName] , [ProductCategoryDescription] , [Expr1] FROM [dbo].[V_ContosoRetailDW_ProductSales] WHERE [ProductKey] = @ProductKey',N'@ProductKey int',@ProductKey=1642






The result is similar to what we can see on the profile page:

See the complete post, including the hands-on lab, as a webcast here: