Montag, 16. Januar 2012

Dienstag, 10. Januar 2012

SharePoint Search – a look behind the scene Part I

Part I: SharePoint Search and BCS
(...which data is stored in the SharePoint Search databases?)

This look behind the scenes will be from the point of a DBA.
Scenario is:
A external LOB system (SQL Server database) is connected via a linked server. (environment restrictions).  A SQL Server view is used to get the data from the source LOB database. Based on the output of this View an External Content Type is defined with SharePoint Designer.
The connection to access the Report_helper Database (containing the View) is made using the logged on users identity.
(This brings a kind of interesting aspects which we will see later….)
Creating the External Content Type with SharePoint Designer results in a entry in the Business Data Connectivity Service. In Business Data Connectivity Service also a “Profile Page Location” is configured. This Results in a “Default Action” for the External Contend Type which shows all fields coming from the SQL Server View.
Now the BCS source is configured as a Content Source in SharePoint Search Administration. After a full crawl the data from the external LOB database can be found via SharePoint Search UI.
So far so good – that’s business as usual. Now let’s have a look at some further aspects.
The logged on user has access to the BCS Service from SharePoint and the user’s logon is also used to access the Report_helper database. The access to the LOB Database is made via a linked Server in the context of a technical user account. Interesting is now, that a user witch didn’t have access to the report_helper Database can find results from the external LOB system anyway.
If this user (with no access to the report_helper Database) now clicks on a result shown in the result set he receives an error: “Invalid Credentials. Please contact your system administrator.
Why is this so?!
To answer this lets have a look how the SharePoint Search Service works in this case.
With the SQL Server Profiler we can catch the query SharePoint generates:
exec sp_executesql
N'/* {ADA5C40F-A0AD-49CC-B7B4-4074D19DFA28} */

exec dbo.proc_MSS_GetMultipleResults @P1,@P2,@P3,@P4,@P5
',N'@P1 int,@P2
nvarchar(4000),@P3 nvarchar(4000),@P4 nvarchar(4000),
@P5 varbinary(8000)',1,N'Auto',N'',
N'SET NOCOUNT ON ; DECLARE @joinRows INT ; SET @joinRows = DATALENGTH(@joinData) / 8 ;;
WITH DocIds(DocId, Value) AS ( SELECT TOP(@joinRows) CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 1, 4)
AS INT), CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 5, 4) AS INT)
FROM dbo.MSSOrdinal AS ord WITH(NOLOCK) WHERE ord.n <= @joinRows )
SELECT P.DocId, P.SummaryBlobSize, P.Size, P.LastModified, P.IsDocument, P.IsPictureUrl, P.Author,
P.Title, P.Url, P.PictureThumbnailUrl, P.ContentClass, P.FileExtension, P.Tags,
P.PropertyBlob, CASE WHEN P.PropertyBlob IS NULL THEN 0
ELSE DATALENGTH(P.PropertyBlob) END, P.PopularSocialTags, P.SiteName, P.Description, P.ParentLink,
P.NumberOfMembers, P.PictureHeightAndWidth, P.DisplayDate FROM dbo.MSSDocResults AS P WITH(NOLOCK),
DocIds AS T WHERE P.DocId = T.DocId OPTION (MAXDOP 1) ',0x0000071000000000000006F5000000010000024C000000020000038E000000030000038D00000004000003EA00000005000003F000000006000003F6000000070000070B00000008

Cryptically, I know.  What happens is that SharePoint calls the StoredProcedure “dbo.proc_MSS_GetMultipleResults” with parameter:
This delivers the data displayed by the result webpart (More details on this in one of the next posts).There is no call for external data because of the overview data is stored in the managed property “HitHighlightedSummary”. This data is in the “PropertyBlob” filed coming with the result of the StoredProcedure
The “PropertyBlob” contains an array PropertyValues. This array contains values for managed property for an item. Also the managed property “HitHighlightedSummary” which contains a dynamic summary that has a collection of highlighted word pieces that best describe the result.
For more information’s about the ProprtyBlob field see: LINK
If the user clicks on a result SharePoint generates another SQL call against the report_helper database / the View to get the values for all fields of the External Content Type - and that’s the cause why the user witch has no access to this Database gets the error “Invalid Credentials. Please contact your system administrator.