Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I populate a parent and a child table from another table?

Tags:

sql-server

I have a table containing a list of words and associated wordForms. The typical data in the table looks like this below. Note some of the columns of WordForms end in , e.g. and some just end with the last wordform word

Id  Word       WordForms
1   abandon    abandoned, abandoning, abandonment, abandons    
2   abstract   abstraction, abstractions, abstractly, abstracts, e.g.

Here is the layout of the source data table:

CREATE TABLE [dbo].[TempWords]
(
    [Id]         INT            IDENTITY (1, 1) NOT NULL,
    [Word]       NVARCHAR (MAX) NOT NULL,
    [WordForms]  NVARCHAR (MAX) NULL,
)

I would like to use this data to populate two tables. I know about using SQL INSERT INTO but that will I think only help me with the one table. What I would like to do is to take the first Word, put that into the Words table and then separate the wordforms that are now divided up by a comma and put those into the WordForms Table.

CREATE TABLE [dbo].[Words]
(
    [WordId]     INT            IDENTITY (1, 1) NOT NULL,
    [Word]       NVARCHAR (MAX) NOT NULL
)

CREATE TABLE [dbo].[WordForms]
(
    [Id]         INT            IDENTITY (1, 1) NOT NULL,
    [WordId]     INT NOT NULL,
    [Text]  NVARCHAR (MAX) NULL,
)

Can anyone give me some tips as to how I can do this?

like image 667
Samantha J T Star Avatar asked Dec 05 '22 18:12

Samantha J T Star


2 Answers

First of all, you can create UDF function to separate CSV values.

CREATE FUNCTION dbo.fn_Split (
      @InputString                  VARCHAR(8000),
      @Delimiter                    VARCHAR(50)
)

RETURNS @Items TABLE (
      Item                          VARCHAR(8000)
)

AS
BEGIN
      IF @Delimiter = ' '
      BEGIN
            SET @Delimiter = ','
            SET @InputString = REPLACE(@InputString, ' ', @Delimiter)
      END

      IF (@Delimiter IS NULL OR @Delimiter = '')
            SET @Delimiter = ','

--INSERT INTO @Items VALUES (@Delimiter) -- Diagnostic
--INSERT INTO @Items VALUES (@InputString) -- Diagnostic

      DECLARE @Item                 VARCHAR(8000)
      DECLARE @ItemList       VARCHAR(8000)
      DECLARE @DelimIndex     INT

      SET @ItemList = @InputString
      SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0)
      WHILE (@DelimIndex != 0)
      BEGIN
            SET @Item = SUBSTRING(@ItemList, 0, @DelimIndex)
            INSERT INTO @Items VALUES (@Item)

            -- Set @ItemList = @ItemList minus one less item
            SET @ItemList = SUBSTRING(@ItemList, @DelimIndex+1, LEN(@ItemList)-@DelimIndex)
            SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0)
      END -- End WHILE

      IF @Item IS NOT NULL -- At least one delimiter was encountered in @InputString
      BEGIN
            SET @Item = @ItemList
            INSERT INTO @Items VALUES (@Item)
      END

      -- No delimiters were encountered in @InputString, so just return @InputString
      ELSE INSERT INTO @Items VALUES (@InputString)

      RETURN

END -- End Function
GO

Then you can use below INSERT statements to populate tables.

INSERT INTO [Words]
SELECT Word FROM TempWords


INSERT INTO WordForms

SELECT
    W.WordId,
    LTRIM(RTRIM(FNT.Item)) AS Item
FROM TempWords AS TW
INNER JOIN [Words] AS W
ON TW.[Word]=W.[Word]
CROSS APPLY fn_Split(REPLACE(TW.WordForms,', e.g.',''),',') AS FNT


SELECT * FROM [Words]
SELECT * FROM WordForms
like image 167
Sandesh Avatar answered May 18 '23 14:05

Sandesh


You can insert words into first table, than parse word forms and insert them into child table with link to parent table.

Link to parent table can be obtained by joining on word column (I guess it's unique) or by doing some MERGE + OUTPUT thing to obtain in one step SOURCE.ID (from @words_csv) and INSERTED.ID. Any way you like.

Parsing could be implemented in many ways too, check this example (actually I would not recommend to parse with sql at all).

DECLARE @words_csv TABLE (Id INT IDENTITY(1, 1), Word VARCHAR(100), WordForms VARCHAR(1000))

INSERT INTO @words_csv(word, wordforms)
VALUES
    ('abandon', 'abandoned, abandoning, abandonment, abandons, e.g.'),
    ('abstract', 'abstraction, abstractions, abstractly, abstracts')

--INSERT INTO [dbo].[Words](word)
--SELECT w.word
--FROM @words_csv w

;WITH word_forms_extracted AS
    (
        SELECT  w.id,
                w.word,
                ltrim(rtrim(cast(case when CHARINDEX(',', w.WordForms) > 0 then substring(w.wordforms, 1, CHARINDEX(',', w.WordForms)-1) end AS VARCHAR(1000)))) wordform,
                stuff(w.wordforms, 1, CHARINDEX(',', w.WordForms), '') wordforms
        FROM    @words_csv w

        UNION ALL

        SELECT  w.id,
                w.word,
                ltrim(rtrim(cast(case when CHARINDEX(',', wfe.WordForms) > 0 then substring(wfe.wordforms, 1, CHARINDEX(',', wfe.WordForms)-1) else wfe.wordforms end AS VARCHAR(1000)))) wordform,
                case when CHARINDEX(',', wfe.WordForms) > 0 then stuff(wfe.wordforms, 1, CHARINDEX(',', wfe.WordForms), '') ELSE '' end wordforms
        FROM    @words_csv w   
                INNER JOIN  word_forms_extracted wfe
                    ON  wfe.id = w.id
        WHERE wfe.wordforms != ''
    )
SELECT wf.id, wf.word, wf.wordform
FROM word_forms_extracted wf
--INNER JOIN [dbo].[Words] w
--ON w.word = wf.word
WHERE wf.wordform NOT IN ('', 'e.g.')
ORDER BY wf.id, wf.wordform
OPTION(MAXRECURSION 1000)

Final SELECT can be easily modified to INSERT INTO dbo.WordForms (...) SELECT ... Link to dbo.Words is obtained here as you can see by joining on word column.

like image 40
Ivan Starostin Avatar answered May 18 '23 14:05

Ivan Starostin